I. On the Dissection of Asymmetrical Frequency-Curves.

(1.) If measurements be made of the same part or organ in several hundred or
thousand specimens of the same type or family, and a curve be constructed of which
the abscissa $x$ represents the size of the organ and the ordinate $y$ the number of
specimens falling within a definite small range $\delta x$ of organ, this curve may be
termed a frequency-curve. The centre or origin for measurement of the organ may, if we
please, be taken at the mean of all the specimens measured. In this case the
frequency-curve may be looked upon as one in which the frequency (per thousand or
per ten thousand, as the case may be) of a given small range of deviations from the
mean is plotted up to the mean of that range. Such frequency-curves play a large
part in the mathematical theory of evolution, and have been dealt with by
Mr. F. Galton, Professor Weldon, and others. In most cases, as in the case of
errors of observation, they have a fairly definite symmetrical shape,* and one that
approaches with a close degree of approximation to the well-known error or
probability-curve. A frequency-curve, which, for practical purposes, can be represented
by the error curve, will for the remainder of this paper be termed a normal curve. When a
series of measurements gives rise to a normal curve, we may probably assume
something approaching a stable condition; there is production and destruction
impartially round the mean. In the case of certain biological, sociological, and
economic measurements there is, however, a well-marked deviation from this
normal shape, and it becomes important to determine the direction and amount of
such deviation. The asymmetry may arise from the fact that the units grouped
together in the measured material are not really homogeneous. It may happen that
we have a mixture of 2, 3, ... $n$ homogeneous groups, each of which deviates about
its own mean symmetrically and in a manner represented with sufficient accuracy by
the normal curve. Thus an abnormal frequency-curve may be really built up of normal
curves having parallel but not necessarily coincident axes and different parameters.
Even where the material is really homogeneous, but gives an abnormal frequency-curve,
the amount and direction of the abnormality will be indicated if this frequency-curve
can be split up into normal curves. The object of the present paper is to discuss the
dissection of abnormal frequency-curves into normal curves. The equations for the
dissection of a frequency-curve into $n$ normal curves can be written down in the same
manner as for the special case of $n = 2$ treated in this paper; they require us only to
calculate higher moments. But the analytical difficulties, even for the case of $n = 2$,
are so considerable, that it may be questioned whether the general theory could ever
be applied in practice to any numerical case.

* Symmetrical shapes may of course occur which are not of the normal or error-curve form. See
Part II., § 11 of this paper.

There are reasons, indeed, why the resolution into two is of special importance. A 
family probably breaks up first into two species, rather than three or more, owing to 
the pressure at a given time of some particular form of natural selection ; in attempt- 
ing to procure an absolutely homogeneous material, we are less likely to have got a 
mixture of three or more heterogeneous groups than of two only. Lastly, even 
where the heterogeneity may be threefold or more, the dissection into two is likely
to give us, at any rate, an approximation to the two chief groups. In the case of 
homogeneous material, with an abnormal frequency-curve, dissection into two normal 
curves will generally give us the amount and direction of the chief abnormality. So 
much, then, may be said of the value of the special case dealt with here. 

A distinction must be made between the two cases which may theoretically occur. 
If we have a real mixture of two normal groups represented by our abnormal frequency- 
curve, then, theoretically, it is possible to find the two components, and these two 
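components must be unique. If they were not unique, a relation of the following kind
must hold for every value of $x$:

$$\frac{c_1}{\sigma_1\sqrt{2\pi}}e^{-\frac{(x-b_1)^2}{2\sigma_1^2}}+\frac{c_2}{\sigma_2\sqrt{2\pi}}e^{-\frac{(x-b_2)^2}{2\sigma_2^2}}=\frac{c_3}{\sigma_3\sqrt{2\pi}}e^{-\frac{(x-b_3)^2}{2\sigma_3^2}}+\frac{c_4}{\sigma_4\sqrt{2\pi}}e^{-\frac{(x-b_4)^2}{2\sigma_4^2}}.$$

Between the six constants on either side of this equation an infinite variety of
relations can be reached by giving $x$ an infinite variety of values, and it seems
impossible to satisfy this series by the same set of values of the constants. For
example, let $x$ be very great, and suppose $\sigma_1$ to be the largest of all the quantities
$\sigma_1$, $\sigma_2$, $\sigma_3$, and $\sigma_4$. Dividing by
$\frac{1}{\sigma_1\sqrt{2\pi}}e^{-(x-b_1)^2/2\sigma_1^2}$ and putting $x$ very great we have

$$c_1+\frac{c_2\sigma_1}{\sigma_2}e^{\frac{(x-b_1)^2}{2\sigma_1^2}-\frac{(x-b_2)^2}{2\sigma_2^2}}=\frac{c_3\sigma_1}{\sigma_3}e^{\frac{(x-b_1)^2}{2\sigma_1^2}-\frac{(x-b_3)^2}{2\sigma_3^2}}+\frac{c_4\sigma_1}{\sigma_4}e^{\frac{(x-b_1)^2}{2\sigma_1^2}-\frac{(x-b_4)^2}{2\sigma_4^2}},$$

whence, proceeding to the limit,

$$c_1=0,$$

unless $\sigma_1=\sigma_3$ or $\sigma_4$.

The first is impossible by hypothesis, therefore the latter must be true, say
$\sigma_1=\sigma_3$. This gives us at once $c_1=c_3$.

Returning to the original equation, and making $x$ large in it, we see that the first
two terms become equal on either side. Hence, the second two terms must become
equal as $x$ approaches infinity, or

$$\frac{c_2}{\sigma_2}e^{-\frac{(x-b_2)^2}{2\sigma_2^2}}=\frac{c_4}{\sigma_4}e^{-\frac{(x-b_4)^2}{2\sigma_4^2}}.$$

Dividing again by the exponential factor on the left, this leads in the same manner as
before to $\sigma_2=\sigma_4$, and, ultimately, to $c_2=c_4$.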

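Our original equation may now be written

$$\frac{c_1}{\sigma_1}\left\{e^{-\frac{(x-b_1)^2}{2\sigma_1^2}}-e^{-\frac{(x-b_3)^2}{2\sigma_1^2}}\right\}=\frac{c_2}{\sigma_2}\left\{e^{-\frac{(x-b_4)^2}{2\sigma_2^2}}-e^{-\frac{(x-b_2)^2}{2\sigma_2^2}}\right\}. \tag{$\eta$}$$

Put $x=\tfrac12(b_1+b_3)$; then the left-hand side vanishes and, accordingly, the right
must vanish, but this involves either

$$b_2=b_4\quad\text{or}\quad b_1+b_3=b_2+b_4.$$

Similarly, putting $x=\tfrac12(b_2+b_4)$, we find that either

$$b_1=b_3\quad\text{or}\quad b_1+b_3=b_2+b_4. \tag{$\alpha$}$$

Thus, either the two sets of components are identical, or ($\alpha$) is true.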

MDCCCXCIV. — A. L 
Multiply equation ($\eta$) above by $x$, $x^3$, and $x^5$ in succession, and integrate the results
respectively between the limits $\infty$ and $-\infty$.* We find

$$(b_1-b_3)\,c_1=(b_4-b_2)\,c_2, \tag{$\beta$}$$

$$\{3\sigma_1^2(b_1-b_3)+b_1^3-b_3^3\}\,c_1=\{3\sigma_2^2(b_4-b_2)+b_4^3-b_2^3\}\,c_2,$$

reducing by aid of ($\alpha$) and ($\beta$) to

$$3\sigma_1^2-b_1b_3=3\sigma_2^2-b_2b_4, \tag{$\gamma$}$$

and

$$\{15\sigma_1^4(b_1-b_3)+10\sigma_1^2(b_1^3-b_3^3)+b_1^5-b_3^5\}\,c_1=\{15\sigma_2^4(b_4-b_2)+10\sigma_2^2(b_4^3-b_2^3)+b_4^5-b_2^5\}\,c_2,$$

reducing by aid of ($\alpha$), ($\beta$), and ($\gamma$) to the two forms,

$$2\sigma_2^2+8\sigma_1^2+3b_1^2+3b_3^2+4b_1b_3=0, \tag{$\delta$}$$

$$2\sigma_1^2+8\sigma_2^2+3b_2^2+3b_4^2+4b_2b_4=0. \tag{$\epsilon$}$$

Equations ($\alpha$), ($\beta$), ($\delta$), and ($\epsilon$) are four independent equations, which suffice to
determine $b_1$, $b_2$, $b_3$, $b_4$, as definite functions of $\sigma_1$, $\sigma_2$, $c_1$, and $c_2$. But $b_1$, $b_2$ are in
general independent of $\sigma_1$, $\sigma_2$, $c_1$, and $c_2$; hence it follows that ($\alpha$) cannot in general
be true, or we must have $b_1=b_3$ and $b_2=b_4$. That is, a curve which breaks up into
two normal components can break up in one way, and one way only.

Now it is clear that in actual statistical practice our abnormal frequency-curve will
never be the absolutely true sum of two normal curves; indeed, if it be not a mixture,
but an asymmetrical frequency-curve, it is not necessarily a very close approach to
the sum of two frequency-curves of normal type; it may be the limit to an
asymmetrical binomial.† We must not, therefore, be surprised if more than one
solution be given by any method of dissection. A mathematical criterion for
discriminating the "true" solution might easily be given. For example, in the method
of the present paper, we might define that as the "true," or at any rate the "best,"
solution which gave for the compound-curve a sixth moment nearest in value to that
of the observation-curve. Such a theoretical criterion, however, may not have much
practical value. For after we have made the areas and first five moments of two
curves identical, their sixth moments will in general be (like their contours) much
closer together than either are to that of the curve of observations. Added to this
the great labour involved in the calculation of the sixth moment is sufficient to deter
the practical statistician, if any other convenient mode (e.g., results of measurement
on other organs) suffices in the particular case to discriminate between the solutions
found.

* The values of the successive moments of the normal-curve are given in § 5 of this paper, and
permit of these integrations being performed at once.

† The general form of the limit to asymmetrical binomials is … where 0, c, and β are constants,
and x is to have positive values only. β is always positive. [A slightly fuller form is given in
the abstract of this paper, 'Roy. Soc. Proc.,' vol. 54, p. 381.]

Thus, while the mathematical solution should be unique, yet from the
utilitarian standpoint we have to be content with a compound curve which fits the 
observations closely, and more than one such compound curve may arise. All we can 
do is to adopt a method which minimizes the divergences of the actual statistics from 
a mathematically true compound. The utilitarian problem is to find the most likely 
components of a curve which is not the true curve, and would only be the true curve 
had we an infinite number of absolutely accurate measurements. As there are 
different methods of fitting a normal curve to a series of observations, depending on 
whether we start from the mean or the median, and proceed by " quartiles," mean error 
or error of mean square, and as these methods lead in some cases to slightly different 
normal-curves, so various methods for breaking up an abnormal frequency-curve may 
lead to different results. As from the utilitarian standpoint good results for a simple 
normal curve are obtained by finding the mean from the first moment, and the error of 
mean square from the second moment, so it seems likely that the present investigation, 
based on the first five or six moments of the frequency-curve, may also lead to good 
results. While a method of equating chosen ordinates of the given curve and those of 
the components leaves each equation based only on the measurements of organs of one 
size, the method of moments uses all the given data in the case of each equation for 
the unknowns, and errors in measurement will, thus, individually have less influence. 
At the same time it would be of great interest to discover whether other methods of 
dissection lead to results identical or nearly identical with the method of moments 
adopted by the present writer. Any other method analytically possible has not yet, 
however, occurred to him ; nor any criterion for distinguishing practically between 
two solutions so close as those of figs. 1 and 2, other than that adopted by Professor 
Weldon when he appeals to the measurements of a correlated organ. 

(2.) In the case of a frequency-curve whose components are two normal curves, the 
complete solution depends, in the method adopted, on finding the roots of a numerical
equation of the ninth order. It is possible that a simpler solution may be found, but 
the method adopted has only been chosen after many trials and failures. Clearly 
each component normal curve has three variables : (i.) the position of its axis, (ii.) its 
" standard-deviation" (Gauss's " Mean Error," Airy's "Error of Mean Square"), and 
(iii.) its area. Six relations between the given frequency-curve and its component 
curves would therefore suffice to determine the six unknowns. Innumerable relations 
of this kind can be written down, but, unfortunately, the majority of them lead to
exponential equations, the solution of which seems more beyond the wit of man than
that of a numerical equation even of the ninth order.

(3.) In any given example the conditions will be sufficient to reduce the suitable 
roots of this equation very largely, possibly to two or even one. These limiting
conditions will be considered later. A suitable root of this equation leads to a 
quadratic for the areas of the two component normal curves. This quadratic is funda- 
mental, and appears to be highly suggestive for the problem of evolution. We have 
two cases : 

(i.) Both its roots are positive. 

In this case the given frequency-curve is the sum of two normal curves. The 
units of the frequency-curve may be considered as composed of definite proportions of 
two species, each of which is stable about its mean. The process of differentiation 
here appears complete. 

(ii.) One root is positive and the other negative. 

The given frequency-curve is now the difference of two probability-curves. The 
probability-curve, with positive area, may possibly now be looked upon as the birth- 
population (unselectively diminished by death). The negative probability-curve is a 
selective diminution of units about a certain mean ; that mean may, perhaps, be the 
average of the less "fit." 

It is possible that in some numerical cases solutions of both the types (i.) and (ii.) 
will be found to exist, but I imagine that in most cases of a well-marked and
characteristic asymmetrical frequency-curve, either only one type of solution will exist, or,
if two types do exist, then one will give a much better agreement with the actual 
shape of the curve than the other. That the two types of solutions should exist side 
by side occasionally is, perhaps, to be expected. In such cases we have examples of 
groups, which are, perhaps, in process of differentiation into separate species by the 
elimination of members round a selected mean. 

(iii.) From the nature of the problem, the case of both roots negative does not 
occur. 

We now pass to the solution of the problem : 

Given an asymmetrical frequency-curve, to break it up, if possible, into two
component probability-curves, or into two normal curves.

(4.) Preliminary Definitions and Problems. 

(i.) Given any curve ABC, and the line $y'y'$, if we take the sum of the products of
every element of area by the $n$th power of the distance of the element from the line
$y'y'$, we form the $n$th moment of the area about the line $y'y'$.

Clearly, if $y$ be the length of a strip parallel to $y'y'$ and $x$ its distance from $y'y'$, then the
$n$th moment $=\int x^n y\,dx$, the integration extending all over ABC, or from A to C in our
case, where the curve is always bounded by a straight line, AC, perpendicular to $y'y'$.

If $h$ be any standard length, say 10 or 100 units, then the $n$th moment is of the
order $h^n\alpha$, if $\alpha$ be the area of ABC. It therefore equals $\mu'_n h^n\alpha$, where $\mu'_n$ is a purely
numerical factor. We shall invariably represent it as the product of these three
factors.

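(ii.) Given the first $n$ moments about $y'y'$, or the coefficients $\mu'_1, \mu'_2, \mu'_3, \mu'_4 \ldots \mu'_n$,
to find the $n$th moment about $yy$ or the coefficient $\mu_n$.

Let the distance between $yy$ and $y'y'$ be $d = qh$, then

$$\mu_n h^n\alpha=\int (x-d)^n\,y\,dx,$$

or

$$\mu_n=\mu'_n-n\,q\,\mu'_{n-1}+\frac{n(n-1)}{1\cdot 2}\,q^2\mu'_{n-2}-\ldots+(-1)^n q^n\mu'_0.$$

In particular, since $\mu'_0 = 1$,

$$\begin{aligned}
\mu_1&=\mu'_1-q,\\
\mu_2&=\mu'_2-2q\mu'_1+q^2,\\
\mu_3&=\mu'_3-3q\mu'_2+3q^2\mu'_1-q^3,\\
\mu_4&=\mu'_4-4q\mu'_3+6q^2\mu'_2-4q^3\mu'_1+q^4,\\
\mu_5&=\mu'_5-5q\mu'_4+10q^2\mu'_3-10q^3\mu'_2+5q^4\mu'_1-q^5.
\end{aligned} \tag{1}$$

When the line $y'y'$ passes through the centroid of the curve, and the curve is
symmetrical about $y'y'$, $\mu'_1$, $\mu'_3$, $\mu'_5$ are all zero. Hence if in this case we take $y'y'$ to
the right of $yy$, or $d$ negative and equal to $-qh$,

$$\begin{aligned}
\mu_1&=q,\\
\mu_2&=\mu'_2+q^2,\\
\mu_3&=3q\mu'_2+q^3,\\
\mu_4&=\mu'_4+6q^2\mu'_2+q^4,\\
\mu_5&=5q\mu'_4+10q^3\mu'_2+q^5.
\end{aligned} \tag{2}$$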
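The transfer of moment-coefficients from one line to a parallel line can be checked numerically. The following is only a sketch of the binomial expansion above; the function name `shift_moment_coefficients` and the sample values are assumptions introduced for illustration and do not occur in the memoir.

```python
from math import comb

def shift_moment_coefficients(raw, q):
    """Moment coefficients about a line at distance d = q*h from the original line.

    raw[k] is the k-th coefficient about the original line, with raw[0] = 1.
    Implements mu_n = sum_k C(n, k) * (-q)**k * raw[n - k].
    """
    return [
        sum(comb(n, k) * (-q) ** k * raw[n - k] for k in range(n + 1))
        for n in range(len(raw))
    ]

# A curve symmetrical about its centroid: odd coefficients vanish.
# The numbers are illustrative only.
m2, m4 = 2.0, 12.0
raw = [1.0, 0.0, m2, 0.0, m4, 0.0]
q = 0.5
# d negative, i.e. d = -q*h, which is the case of equations (2).
shifted = shift_moment_coefficients(raw, -q)
```

With the odd coefficients set to zero the result should agree with equations (2) term by term, which is what the check relies on.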



(iii.) The distance of the centroid of ABC from $y'y'$ is the ratio of its first moment
$\mu'_1 h\alpha$ to its area $\alpha$, and $=\mu'_1 h$.

(iv.) To find the successive moments of a given curve about a given line. 

For the purposes of the present problem we require only the first five moments of 
a curve like ABC about a line yy passing through its centroid. The solution may be 
obtained either analytically or graphically according to the accuracy or rapidity with 
which we wish to work. 

(a.) Analytically. — Suppose the frequency-curve to be obtained by plotting up the 
results of 1000 measurements, each unit of length along AC corresponding to an 
equal change in the deviation. Starting from the point C, beyond which no 
individual occurs, we may have in practice, perhaps, 20 to 30 equal ranges of 
deviations before we reach the point A, which terminates the deviations on the left. 
The equal range being taken as the unit of length, let the numbers in the groups at 
1, 2, 3, 4, 5 . . . units of distance from C be $y_1, y_2, y_3, y_4, y_5 \ldots$

Then the $n$th moment clearly equals very approximately

$$1^n\times y_1+2^n\times y_2+3^n\times y_3+4^n\times y_4+\ldots,$$

or since $\alpha = 1000$, and $h$ may be conveniently taken $= 100$,

$$\mu'_n=\frac{1^n\times y_1+2^n\times y_2+3^n\times y_3+4^n\times y_4+\ldots}{1000\times(100)^n}.$$

Sufficiently accurate values can then be found for $\mu'_1, \mu'_2, \mu'_3, \mu'_4, \mu'_5$, provided we
know the 2nd, 3rd, 4th, and 5th powers of the natural numbers up to about 20 to 30.
The values of these powers up to 30 are given later in this paper.
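The summation just described is short to carry out by machine. The sketch below is an illustration under stated assumptions: the function name `raw_moment_coefficients` is hypothetical, the area is taken as the total of the counts rather than fixed at 1000, and the sample counts are made up.

```python
def raw_moment_coefficients(counts, h=1.0, max_order=5):
    """Coefficients mu'_n = sum(i**n * y_i) / (area * h**n) for n = 0..max_order.

    counts[i - 1] is the number of specimens in the group lying
    i units of distance from the origin C.
    """
    area = sum(counts)
    return [
        sum((i ** n) * y for i, y in enumerate(counts, start=1)) / (area * h ** n)
        for n in range(max_order + 1)
    ]

toy_counts = [1, 4, 6, 4, 1]          # illustrative numbers only
coeffs = raw_moment_coefficients(toy_counts)
```

Here `coeffs[1]` is the distance of the centroid from C in units of `h`, as in (iii.) above.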

Knowing the first five moments about the vertical through C, we can find the 
centroid by aid of (iii.) above, and then the moments about the vertical through the 
centroid by aid of equations (1). 

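Since $\mu_1 = 0$ for the centroid, $\mu'_1 = q$, and therefore we have the following to
determine the other moments:

$$\begin{aligned}
\mu_2&=\mu'_2-\mu_1'^{\,2},\\
\mu_3&=\mu'_3-3\mu'_1\mu'_2+2\mu_1'^{\,3},\\
\mu_4&=\mu'_4-4\mu'_1\mu'_3+6\mu_1'^{\,2}\mu'_2-3\mu_1'^{\,4},\\
\mu_5&=\mu'_5-5\mu'_1\mu'_4+10\mu_1'^{\,2}\mu'_3-10\mu_1'^{\,3}\mu'_2+4\mu_1'^{\,5}.
\end{aligned} \tag{4}$$

The centroid having been found, it may be asked: Why should we not calculate
$\mu_2, \mu_3, \mu_4, \mu_5$ directly? The answer lies in the fact that the centroid will not generally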
coincide with a unit division on the deviation axis, and the powers to be calculated, 
instead of being those of two place figures, become in general powers of numbers 
containing three or four figures. Thus the labour of the arithmetic is much increased. 

(b.) Graphically. If the figure be drawn on a large scale, the moments may be
found with a fair degree of accuracy by aid of the following process, which has long
been of use in graphical statics for finding the first, second, and third moments of
plane areas.*

It is required to find the moments about $O'y'$ of the curve ABC, bounded by the
straight line $O'CA$. Take $O''y''$ parallel to $O'y'$ and at distance $h$. Take any line
$PP'$, parallel to $O'y'$, from AC to ABC; let the perpendicular from $P'$ on $O''y''$ meet it
in $N'$, and let $O'N'$ meet $PP'$ in $Q_1$; let the perpendicular from $Q_1$ on $O''y''$ meet it
in $N''$, and let $O'N''$ meet $PP'$ in $Q_2$; let the perpendicular from $Q_2$ on $O''y''$ meet it
in $N'''$, and let $O'N'''$ meet $PP'$ in $Q_3$. In this manner a series of points $Q_1, Q_2, Q_3,
Q_4, Q_5$ are determined. Let these points be determined for a series of positions of
$PP'$ taken at short intervals from C to A, then all the corresponding Q being joined,
we obtain curves termed respectively the first, second, third, fourth, and fifth
moment-curves. Let the areas $AQ_1L_1C$, $AQ_2L_2C$, &c., be read off with a planimeter, and be
$\alpha_1, \alpha_2, \alpha_3 \ldots$ Then

$$\mu'_1=\alpha_1/\alpha,\quad \mu'_2=\alpha_2/\alpha,\quad \ldots,\quad \mu'_5=\alpha_5/\alpha.$$

* The third moment of a plane area is used in determining graphically the moment of inertia of a
spindle about its axis. The method described is sometimes attributed to Collignon, but seems to have
been long in use to find "equivalent figures" in the case of beam sections.

A good draughtsman will construct these curves with great readiness, and if on a 
sufficiently large scale, the results may be read to within one per cent. error.*

Equations (4) then enable us to complete the problem of finding the moments
about a line through the centroid. Or, the first moment being found about $O'y'$, and
so the centroid determined, we may shift $O'y'$ till it passes through the centroid,
and then proceed to find $\mu_2 \ldots \mu_5$ directly in the above manner. In this case care
will have to be taken in reading the areas of the moment-curves, which have now 
pieces of their areas negative, to carry the planimeter point, in the proper sense, round 
their contours. 

(5.) Properties of the probability-curve.

Let the equation to the probability-curve be

$$y=\frac{c}{\sigma\sqrt{2\pi}}\,e^{-\frac{x^2}{2\sigma^2}}.$$

Then $\sigma$ will be termed its standard-deviation (error of mean square). $c$ is the
total number of units measured, or the area of the probability curve.

(i.) To find the second and fourth moments of the probability-curve about the axis
of $y$.

Let them be $M_2'$ and $M_4'$.
Then

$$M_2'=2\int_0^\infty y\,x^2\,dx=c\times\sigma^2,\qquad M_4'=2\int_0^\infty y\,x^4\,dx=c\times 3\sigma^4.$$

Clearly $M_3'$ and $M_5'$ are zero.

[* My demonstrator, Mr. G. U. Yule, has graphically calculated the first four moments of a number
of statistical frequency-curves, with the object of fitting them to the generalized probability-curve (see
footnote, p. 74). The method is sufficiently accurate in practice, and I hope soon to have an instrument
to construct these curves mechanically, designed by him. February 9, 1894.]
(ii.) Now let $\alpha$ be a standard area and $h$ a standard length. Let us use
Equations (2) of Art. 4 (ii.), taking $y'y'$ as the axis of symmetry of the
probability-curve, and $yy$ at a distance $b$ to the left, then

$$\begin{aligned}
M_1&=bc,\\
M_2&=(\sigma^2+b^2)\,c,\\
M_3&=(3b\sigma^2+b^3)\,c,\\
M_4&=(3\sigma^4+6b^2\sigma^2+b^4)\,c,\\
M_5&=(15\sigma^4b+10b^3\sigma^2+b^5)\,c.
\end{aligned}$$

Now let $c/\alpha = z$, $\sigma/b = u$, and $b/h = \gamma$.

Then $z$, $u$, and $\gamma$ are purely numerical quantities, and we have for the first five
moments round $yy$

$$\begin{aligned}
M_1&=\gamma z\,\alpha h,\\
M_2&=\gamma^2 z\,(1+u^2)\,\alpha h^2,\\
M_3&=\gamma^3 z\,(1+3u^2)\,\alpha h^3,\\
M_4&=\gamma^4 z\,(1+6u^2+3u^4)\,\alpha h^4,\\
M_5&=\gamma^5 z\,(1+10u^2+15u^4)\,\alpha h^5.
\end{aligned} \tag{7}$$
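The five moments of a single component can be compared with a direct numerical integration of the probability-curve. This is a sketch only: the names `component_moments` and `numeric_moment`, the midpoint rule, and the integration width are assumptions made for the check, and the sample constants are those of the first component found in Art. 9.

```python
from math import exp, pi, sqrt

def component_moments(c, b, sigma):
    """First five moments of a normal component about a line at distance b from its axis."""
    s2, s4 = sigma ** 2, sigma ** 4
    return [
        b * c,
        (s2 + b ** 2) * c,
        (3 * b * s2 + b ** 3) * c,
        (3 * s4 + 6 * b ** 2 * s2 + b ** 4) * c,
        (15 * s4 * b + 10 * b ** 3 * s2 + b ** 5) * c,
    ]

def numeric_moment(c, b, sigma, n, steps=20000, width=12.0):
    """Midpoint-rule estimate of the n-th moment, used for comparison only."""
    lo = b - width * sigma
    dx = 2 * width * sigma / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * dx
        y = c / (sigma * sqrt(2 * pi)) * exp(-((x - b) ** 2) / (2 * sigma ** 2))
        total += x ** n * y * dx
    return total

closed_form = component_moments(414.5, -3.517, 4.4685)
by_quadrature = [numeric_moment(414.5, -3.517, 4.4685, n) for n in range(1, 6)]
```

The two lists should agree to many figures; dividing each moment by $\alpha h^n$ gives the numerical forms (7).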



(6.) We are now in a position to write down the equations which give the general
solution of our problem. Let the deviation-axis of the asymmetrical frequency-curve
be taken as axis of $x$, and let the axis of $y$ be a perpendicular on this axis through
the centroid of the frequency-curve. Let this centroid and the first five
moment-coefficients about the axis of $y$ of the frequency-curve, i.e., $0, \mu_2, \mu_3, \mu_4, \mu_5$, be found
either analytically or graphically by the methods suggested in Art. 4 (iv.).

Then, if the position and magnitude of the component normal curves be given
by the quantities $b_1, c_1, \sigma_1$, and $b_2, c_2, \sigma_2$, or the corresponding numerics
$\gamma_1, z_1, u_1$, and $\gamma_2, z_2, u_2$, we have, since moments round the vertical axis are
clearly additive,

$$\begin{aligned}
c_1+c_2&=\alpha,\\
(\gamma_1z_1+\gamma_2z_2)\,\alpha h&=0,\\
\{\gamma_1^2z_1(1+u_1^2)+\gamma_2^2z_2(1+u_2^2)\}\,\alpha h^2&=\mu_2\,\alpha h^2,\\
\{\gamma_1^3z_1(1+3u_1^2)+\gamma_2^3z_2(1+3u_2^2)\}\,\alpha h^3&=\mu_3\,\alpha h^3,\\
\{\gamma_1^4z_1(1+6u_1^2+3u_1^4)+\gamma_2^4z_2(1+6u_2^2+3u_2^4)\}\,\alpha h^4&=\mu_4\,\alpha h^4,\\
\{\gamma_1^5z_1(1+10u_1^2+15u_1^4)+\gamma_2^5z_2(1+10u_2^2+15u_2^4)\}\,\alpha h^5&=\mu_5\,\alpha h^5.
\end{aligned}$$

The first equation here represents the equality of the areas of the resultant curve and
its components. Reducing to the simplest terms, we have the following six equations
to find the six unknowns, $z_1, z_2, \gamma_1, \gamma_2, u_1, u_2$:

$$z_1+z_2=1, \tag{8}$$
$$\gamma_1z_1+\gamma_2z_2=0, \tag{9}$$
$$\gamma_1^2z_1(1+u_1^2)+\gamma_2^2z_2(1+u_2^2)=\mu_2, \tag{10}$$
$$\gamma_1^3z_1(1+3u_1^2)+\gamma_2^3z_2(1+3u_2^2)=\mu_3, \tag{11}$$
$$\gamma_1^4z_1(1+6u_1^2+3u_1^4)+\gamma_2^4z_2(1+6u_2^2+3u_2^4)=\mu_4, \tag{12}$$
$$\gamma_1^5z_1(1+10u_1^2+15u_1^4)+\gamma_2^5z_2(1+10u_2^2+15u_2^4)=\mu_5. \tag{13}$$

Equations (8) to (13) give the complete solution of the problem.* After several trials,
I find that the elimination of $z_1, z_2, u_1, u_2$ from these equations, and the determination
of equations giving $\gamma_1\gamma_2$ and $\gamma_1+\gamma_2$ appear to lead to a resulting equation of the
lowest possible order.

(7.) Eliminating $z_2$ between (8) and (9), we have

$$z_1=-\frac{\gamma_2}{\gamma_1-\gamma_2}. \tag{14}$$

Similarly,

$$z_2=\frac{\gamma_1}{\gamma_1-\gamma_2}. \tag{15}$$

Equations (14) and (15) clearly give the numbers in the component groups so soon
as $\gamma_1$ and $\gamma_2$ are found.

* All my attempts to obtain a simpler set have failed. Equating of selected ordinates, or of selected
portions of area, or of moments round the axis of $x$, all appear to lead to exponential equations defying
solution. It is possible, however, that some other six equations of a less complex kind may ultimately
be found.

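Substituting these values of $z_1$ and $z_2$ in (10) and (11), we have two equations to
determine $u_1^2$ and $u_2^2$ in terms of $\gamma_1$, $\gamma_2$. Solving them we find

$$\gamma_1u_1^2=\frac{\mu_2}{\gamma_1}-\frac13\frac{\mu_3}{\gamma_1\gamma_2}-\frac13(\gamma_1+\gamma_2)+\gamma_2, \tag{16}$$

$$\gamma_2u_2^2=\frac{\mu_2}{\gamma_2}-\frac13\frac{\mu_3}{\gamma_1\gamma_2}-\frac13(\gamma_1+\gamma_2)+\gamma_1. \tag{17}$$

These equations clearly give $u_1^2$ and $u_2^2$, and, therefore, the standard-deviations of
the component groups when $\gamma_1$ and $\gamma_2$ are known.
For brevity, put

$$p_1=\gamma_1+\gamma_2,\qquad p_2=\gamma_1\gamma_2,\qquad v_1=(\gamma_1u_1)^2,\qquad v_2=(\gamma_2u_2)^2.$$

Then

$$v_1=\mu_2-\tfrac13\mu_3/\gamma_2-\tfrac13p_1\gamma_1+p_2, \tag{18}$$

$$v_2=\mu_2-\tfrac13\mu_3/\gamma_1-\tfrac13p_1\gamma_2+p_2; \tag{19}$$

while from (12) and (13) we have

$$2(\gamma_1v_1-\gamma_2v_2)+\frac{v_1^2}{\gamma_1}-\frac{v_2^2}{\gamma_2}=(\gamma_1-\gamma_2)\left\{\tfrac13p_2-\tfrac13p_1^2-\tfrac13\mu_4/p_2\right\}, \tag{20}$$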
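$$2(\gamma_1^2v_1-\gamma_2^2v_2)+3(v_1^2-v_2^2)=(\gamma_1-\gamma_2)\left\{\tfrac25p_1p_2-\tfrac15p_1^3-\tfrac15\mu_5/p_2\right\}. \tag{21}$$

We must now substitute (18) and (19) in (20) and (21). We find

$$\gamma_1v_1-\gamma_2v_2=(\gamma_1-\gamma_2)\left\{\mu_2-\tfrac13\frac{\mu_3p_1}{p_2}-\tfrac13p_1^2+p_2\right\},$$

$$\gamma_1^2v_1-\gamma_2^2v_2=(\gamma_1-\gamma_2)\left\{\mu_2p_1-\tfrac13\frac{\mu_3p_1^2}{p_2}+\tfrac13\mu_3-\tfrac13p_1^3+\tfrac43p_1p_2\right\},$$

$$\frac{v_1^2}{\gamma_1}-\frac{v_2^2}{\gamma_2}=(\gamma_1-\gamma_2)\left\{-\frac{\mu_2^2}{p_2}+\tfrac19\frac{\mu_3^2}{p_2^2}+\tfrac29\frac{\mu_3p_1}{p_2}+\tfrac19p_1^2-p_2-2\mu_2\right\},$$

$$v_1^2-v_2^2=(\gamma_1-\gamma_2)\left\{\tfrac19\frac{\mu_3^2p_1}{p_2^2}+\tfrac19p_1^3-\tfrac23\mu_2p_1+\tfrac29\frac{\mu_3p_1^2}{p_2}-\tfrac23\mu_3-\tfrac23\frac{\mu_2\mu_3}{p_2}-\tfrac23p_1p_2\right\},$$

whence,

$$\frac{\mu_3^2}{p_2^2}-\frac{4\mu_3p_1}{p_2}-2p_1^2+6p_2-\frac{9(\mu_2^2-\frac13\mu_4)}{p_2}=0,$$

$$\frac{5\mu_3^2p_1}{p_2^2}-2p_1^3+4p_1p_2-20\mu_3-\frac{15(2\mu_2\mu_3-\frac15\mu_5)}{p_2}=0.$$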
Write

$$\lambda_4=9\mu_2^2-3\mu_4,\qquad \lambda_5=30\mu_2\mu_3-3\mu_5, \tag{22}$$

and put

$$p_3=p_1p_2; \tag{23}$$

then, multiplying up, the above equations become

$$\mu_3^2-4\mu_3p_3-2p_3^2-\lambda_4p_2+6p_2^3=0, \tag{24}$$

$$5\mu_3^2p_3-2p_3^3+4p_3p_2^3-20\mu_3p_2^3-\lambda_5p_2^2=0. \tag{25}$$

From these equations let us first find $p_3$ in terms of $p_2$. Multiply the first by $p_3$ and
subtract from the second:

$$4\mu_3p_3^2+p_3\,(4\mu_3^2+\lambda_4p_2-2p_2^3)-20\mu_3p_2^3-\lambda_5p_2^2=0. \tag{26}$$

Multiply (24) by $2\mu_3$ and add to (26); we find

$$2\mu_3^3+p_3\,(-4\mu_3^2+\lambda_4p_2-2p_2^3)-2\mu_3\lambda_4p_2-\lambda_5p_2^2-8\mu_3p_2^3=0,$$

or

$$p_3=\frac{2\mu_3^3-2\mu_3\lambda_4p_2-\lambda_5p_2^2-8\mu_3p_2^3}{4\mu_3^2-\lambda_4p_2+2p_2^3}. \tag{27}$$

Hence, so soon as $p_2$ is known, $p_1=p_3/p_2$ can be found, and then $\gamma_1$ and $\gamma_2$ will be
the two roots of the quadratic

$$\gamma^2-p_1\gamma+p_2=0. \tag{28}$$

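Returning to (27), substitute this value of $p_3$ in (24), and we have an equation
containing $p_2$ only, on which the whole solution of the problem now turns.
This equation is the following one:

$$\begin{aligned}
24p_2^9&-28\lambda_4p_2^7+36\mu_3^2p_2^6-(24\mu_3\lambda_5-10\lambda_4^2)\,p_2^5-(148\mu_3^2\lambda_4+2\lambda_5^2)\,p_2^4\\
&+(288\mu_3^4-12\lambda_4\lambda_5\mu_3-\lambda_4^3)\,p_2^3+(24\mu_3^3\lambda_5-7\mu_3^2\lambda_4^2)\,p_2^2+32\mu_3^4\lambda_4\,p_2-24\mu_3^6=0.
\end{aligned} \tag{29}$$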
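The coefficients of the nonic are easily tabulated from the moment-coefficients. The sketch below is illustrative: the function name `nonic_coefficients` and the list layout (highest power of $p_2$ first) are assumptions made here, not notation from the memoir.

```python
def nonic_coefficients(mu2, mu3, mu4, mu5):
    """Coefficients of equation (29), from p2**9 down to the constant term."""
    lam4 = 9 * mu2 ** 2 - 3 * mu4          # equation (22)
    lam5 = 30 * mu2 * mu3 - 3 * mu5        # equation (22)
    return [
        24.0,
        0.0,                               # no p2**8 term
        -28 * lam4,
        36 * mu3 ** 2,
        -(24 * mu3 * lam5 - 10 * lam4 ** 2),
        -(148 * mu3 ** 2 * lam4 + 2 * lam5 ** 2),
        288 * mu3 ** 4 - 12 * lam4 * lam5 * mu3 - lam4 ** 3,
        24 * mu3 ** 3 * lam5 - 7 * mu3 ** 2 * lam4 ** 2,
        32 * mu3 ** 4 * lam4,
        -24 * mu3 ** 6,
    ]

# For a normal curve (mu3 = mu5 = 0, mu4 = 3 * mu2**2) only the first coefficient survives.
normal_case = nonic_coefficients(1.0, 0.0, 3.0, 0.0)
```

Dividing every entry by 24 gives the form with leading coefficient unity, which is convenient for numerical work.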

(8.) Some remarks may be made on this equation. Since this equation is of an odd
order, one real root may always be found. Further, remembering that $\lambda_4=9\mu_2^2-3\mu_4$
and $\lambda_5=30\mu_2\mu_3-3\mu_5$, we see that in the case of a normal curve, for which
$\mu_4=3\mu_2^2$, while $\mu_3$ and $\mu_5=0$, all the coefficients of the above equation of the
ninth order vanish except the first.

Thus $p_2$, as we should naturally expect, will be zero. Accordingly, since, with
increasing symmetry, the coefficients become small, it will be needful to work their
values out to a greater degree of exactness the slighter the degree of asymmetry.
Given that a frequency-curve is compounded of two normal curves, equations
(29), (28), (27), (14), (15), (16), and (17) form the complete solution of the problem.

We may throw the whole solution into the following form:

Stage I. Find the centroid of the frequency-curve and calculate $\mu_2, \mu_3, \mu_4, \mu_5, \lambda_4$,
and $\lambda_5$.

Stage II. Solve (29) for $p_2$ and find the corresponding values of $p_1$ from (27).

Stage III. Find the positions of the axes of the component normal curves
from (28).

Stage IV. The fractions $z_1$ and $z_2$ that the areas of the normal curves are of
the area of the frequency-curve are the roots of the quadratic:

$$z^2-z-\frac{p_2}{p_1^2-4p_2}=0. \tag{30}$$

Stage V. Since $\sigma_1/h=\sqrt{v_1}$ and $\sigma_2/h=\sqrt{v_2}$, the standard-deviations are given
at once on substituting in (18) and (19).
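Stages II. to V. can be strung together for any one root $p_2$ of the nonic. What follows is a minimal sketch, not a definitive implementation: the function name `dissect` is hypothetical, it takes the root $p_2$ as given rather than solving (29), and it assumes $p_1^2-4p_2$, $v_1$ and $v_2$ are positive so that the square roots are real (otherwise `sqrt` raises an error). The fractions $z_1$, $z_2$ are taken from (14) and (15), which give the same values as the roots of (30).

```python
from math import sqrt

def dissect(mu2, mu3, mu4, mu5, p2):
    """Return two (z, gamma, sigma/h) triples for a given root p2 of the nonic."""
    lam4 = 9 * mu2 ** 2 - 3 * mu4
    lam5 = 30 * mu2 * mu3 - 3 * mu5
    # Equation (27): p3 in terms of p2, then p1 = p3 / p2.
    p3 = (2 * mu3 ** 3 - 2 * mu3 * lam4 * p2 - lam5 * p2 ** 2 - 8 * mu3 * p2 ** 3) / (
        4 * mu3 ** 2 - lam4 * p2 + 2 * p2 ** 3
    )
    p1 = p3 / p2
    # Equation (28): gamma_1 and gamma_2 are the roots of g**2 - p1*g + p2 = 0.
    root = sqrt(p1 ** 2 - 4 * p2)
    g1, g2 = (p1 - root) / 2, (p1 + root) / 2
    # Equations (14) and (15): fractional areas of the two components.
    z1, z2 = -g2 / (g1 - g2), g1 / (g1 - g2)
    # Equations (18) and (19): squares of the standard-deviations in units of h.
    v1 = mu2 - mu3 / (3 * g2) - p1 * g1 / 3 + p2
    v2 = mu2 - mu3 / (3 * g1) - p1 * g2 / 3 + p2
    return (z1, g1, sqrt(v1)), (z2, g2, sqrt(v2))
```

Multiplying each $z$ by the total area and each $\gamma$ and $\sigma/h$ by $h$ recovers $c$, $b$ and $\sigma$ for the two components.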

(9.) The whole method may be illustrated by the following numerical example:
Breadth of "Forehead" of Crabs. Professor W. F. R. Weldon has very kindly
given me the following statistics from among his measurements on crabs. They are
for 1000 individuals from Naples. The abscissae of the curve are the ratio of
"forehead" to body-length, and one unit of abscissa = .004 of body-length. No. 1 of the
abscissae corresponds to .580 to .583 of body-length. The ordinates represent the
number of individual crabs corresponding to each set of ratios of forehead to
body-length. Thus there was one crab fell into the range .580 to .583, three fell into the
range .584 to .587, five into the range .588 to .591, and so on. The average length
of animals measured 35 millims., and measurements were recorded to .1 millim.



Abscissae.   Ordinates.   Abscissae.   Ordinates.
     1            1           16           74
     2            3           17           84
     3            5           18           86
     4            2           19           96
     5            7           20           85
     6           10           21           75
     7           13           22           47
     8           19           23           43
     9           20           24           24
    10           25           25           19
    11           40           26            9
    12           31           27            5
    13           60           28            0
    14           62           29            1
    15           54
This curve is plotted out as the dark continuous line in Plate 1, fig. 1, and is
clearly asymmetrical. I proceeded to calculate its first five moments in the analytical
method suggested on p. 78 (a), each calculation being made twice independently.
I took $h = 1$, and clearly $\alpha = 1000$. The moments were taken about the vertical
through the point O, and were calculated by the aid of Table I. of the powers of the
first 30 natural numbers given at the end of this memoir. The following results
were obtained:

$\mu'_1$ = 16.799
$\mu'_2$ = 304.923
$\mu'_3$ = 5,831.759
$\mu'_4$ = 116,061.435
$\mu'_5$ = 2,385,609.719

$\mu'_1$, since $h = 1$, is clearly the distance of the centroid vertical of the
frequency-curve from the origin O, i.e. $= q$ of p. 77 (ii.).

The moments about this centroid vertical were now calculated by aid of (1), p. 77.
There resulted:

$\mu_2$ = 22.716599
$\mu_3$ = -53.874770
$\mu_4$ = 1576.533413

$\lambda_4$ = -85.205407
$\lambda_5$ = -7920.604761

where $\lambda_4$, $\lambda_5$ are given in terms of the $\mu$'s by (22) of p. 84.
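These values can be checked by the direct route, summing powers of the deviations from the centroid, which is laborious by hand but trivial by machine. The sketch below makes two assumptions: the blank entry against abscissa 28 in the table is read as zero (this makes the total 1000), and the helper name `central_moment` is introduced here for illustration.

```python
# Ordinates for abscissae 1..29; the blank entry at abscissa 28 is taken as 0.
counts = [1, 3, 5, 2, 7, 10, 13, 19, 20, 25, 40, 31, 60, 62, 54,
          74, 84, 86, 96, 85, 75, 47, 43, 24, 19, 9, 5, 0, 1]

area = sum(counts)
mean = sum(i * y for i, y in enumerate(counts, start=1)) / area

def central_moment(n):
    """n-th moment coefficient about the centroid vertical, with h = 1."""
    return sum((i - mean) ** n * y for i, y in enumerate(counts, start=1)) / area

mu2, mu3, mu4, mu5 = (central_moment(n) for n in (2, 3, 4, 5))
lam4 = 9 * mu2 ** 2 - 3 * mu4
lam5 = 30 * mu2 * mu3 - 3 * mu5
```

The quantities `mean`, `mu2`, `mu3`, `mu4` and `lam4` correspond to $\mu'_1$, $\mu_2$, $\mu_3$, $\mu_4$ and $\lambda_4$ above.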

Turning now to the fundamental nonic (29), let it be divided by 24, and written in
the form

$$p_2^9+a_2p_2^7+a_3p_2^6+a_4p_2^5+a_5p_2^4+a_6p_2^3+a_7p_2^2+a_8p_2+a_9=0.$$

Then the coefficients $a_2, a_3 \ldots$ were calculated, and the following values found:

$a_2$ =              99.406
$a_3$ =           4,353.742
$a_4$ =        -423,696
$a_5$ ≈      -3,702,900
$a_6$ ≈     119,299,000
$a_7$ =   1,232,409,400
$a_8$ =    -957,080,900
$a_9$ = -24,451,990,000
Put $p_2 = 10\chi$ and divide by $10^9$; we then have for the fundamental nonic the
following equation, where only three decimal places are retained:

$$\chi^9+0.994\chi^7+4.354\chi^6-42.370\chi^5-37.029\chi^4+119.299\chi^3+123.241\chi^2-9.571\chi-24.452=0.$$
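Once a root has been bracketed, it can be refined by simple bisection. The sketch below is not the procedure of the memoir, which proceeds by Sturm's functions and successive approximation; the function names and the two brackets used are assumptions chosen for illustration, and the coefficients are the three-decimal values printed above.

```python
# Coefficients of the reduced nonic, from chi**9 down to the constant term.
COEFFS = [1.0, 0.0, 0.994, 4.354, -42.370, -37.029, 119.299, 123.241, -9.571, -24.452]

def nonic(chi):
    """Evaluate the reduced nonic by Horner's rule."""
    value = 0.0
    for a in COEFFS:
        value = value * chi + a
    return value

def bisect_root(f, lo, hi, tol=1e-10):
    """Plain bisection; requires f(lo) and f(hi) to have opposite signs."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("root is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

chi_positive = bisect_root(nonic, 0.0, 1.0)
chi_negative = bisect_root(nonic, -0.75, -0.5)
```

Multiplying a root $\chi$ by 10 gives the corresponding value of $p_2$.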

After a somewhat laborious calculation, the values of Sturm's functions $f(\chi)$,
$f_1(\chi), f_2(\chi), \ldots, f_9(\chi)$ were ascertained and gave the following results:

For $\chi = +\infty$ the series of signs of $f, f_1, \ldots, f_9$ shows 3 changes.
For $\chi = -\infty$ the series of signs of $f, f_1, \ldots, f_9$ shows 6 changes.

Thus there are 6 - 3 = 3 real roots.

These three real roots were then localized as follows:

Two roots between 0 and -1, $\chi_1$ and $\chi_2$.
One root between 0 and 1, $\chi_3$.

As successive approximations, I found:

To $\chi_1$:   -1,    -.89,   -.870,   -.8757,
 „ $\chi_2$:   -.5,   -.65,   -.670,   -.6724,
 „ $\chi_3$:    .5,    .40,    .422,    .4170.

With sufficient accuracy we may then take for the values of $p_2$:

1st solution, $p_2$ = -8.757.
2nd solution, $p_2$ = -6.724.
3rd solution, $p_2$ = 4.170.



88 PROF. K. PEARSON ON THE MATHEMATICAL THEORY OF EVOLUTION. 

Discussion of first solution. $p_2$ = -8.757. $p_3$ was first calculated from (27) on
p. 84, and then $p_1 = p_3/p_2$ found. There resulted: $p_1$ = -1.027.

The quadratic for $\gamma_1$, $\gamma_2$, which are here identical with $b_1$, $b_2$ (the
distances of the centroids of the component probability-curves from the centroid
vertical of the frequency-curve), is:

$$\gamma^2 + 1.027\gamma - 8.757 = 0,$$

whence

$$\gamma_1 = -3.517, \qquad \gamma_2 = 2.490.$$

The values of $z_1$ and $z_2$ were now found from (14) and (15) of p. 82.

$$z_1 = .4145, \qquad z_2 = .5855,$$

thus the numbers of individuals in either group are respectively

$$c_1 = 414.5, \qquad c_2 = 585.5.$$

The values of the standard-deviations, $\sigma_1$ and $\sigma_2$, were now determined from
(18) and (19), where, since $h = 1$, $v_1 = \sigma_1$ and $v_2 = \sigma_2$. At the same time the
maximum ordinates of the component probability-curves, $y_1$ and $y_2$, were found from

$$y_1 = \frac{c_1}{\sqrt{2\pi}\,\sigma_1}, \qquad y_2 = \frac{c_2}{\sqrt{2\pi}\,\sigma_2}.$$

There resulted

$$\sigma_1 = 4.4685, \qquad \sigma_2 = 3.1154,$$
$$y_1 = 37.008, \qquad y_2 = 74.976.$$

Thus the 1st solution may be summed up as follows:

1st Component.            2nd Component.

$c_1$ = 414.5,            $c_2$ = 585.5.

$b_1$ = -3.517,           $b_2$ = 2.490.

$\sigma_1$ = 4.4685,      $\sigma_2$ = 3.1154.

$y_1$ = 37.008,           $y_2$ = 74.976.

These two normal curves were now drawn by aid of the Table II., which was
calculated afresh for this purpose from the exponential.* These curves are plotted out
in fig. 1, and their ordinates added together give the resultant curve. It will be seen that
this curve is in remarkably close agreement with the original asymmetrical frequency-curve,
an agreement quite as close as we could reasonably expect from the comparative
smallness of the number of individuals dealt with, and the resulting fact
that the observation-curve can at best only be an approximation to the true
resultant.

* I have always found it more convenient to work with the standard-deviation than with the probable
error or the modulus, in terms of which the error-function is usually tabulated.

2nd Solution. Precisely similar calculations were undertaken for the value
$p_2$ = -6.724, and it will, accordingly, be sufficient to cite the final conclusions
here.

Quadratic for $\gamma$: $\gamma^2 - .3412\gamma - 6.724 = 0$.

1st Component.            2nd Component.

$c_1$ = 467.2,            $c_2$ = 532.8.

$b_1$ = 2.769,            $b_2$ = -2.428.

$\sigma_1$ = 2.878,       $\sigma_2$ = 4.7702.

$y_1$ = 64.764,           $y_2$ = 44.559.

These component-curves are drawn in fig. 2, and their ordinates added together.
We see that we have again broken up our asymmetrical frequency-curve into two
probability-curves, whose sum is a very close approximation to the original curve.

3rd Solution: $p_2$ = 4.170.

While the first two solutions have been additive, this solution makes $\gamma_1$ and $\gamma_2$
($p_2 = \gamma_1\gamma_2$) of the same sign, or the centroids of the component curves fall both on the
same side of the centroid vertical of the frequency-curve. Accordingly the area of
one of them must be negative, and the solution promised to be a subtractive one, i.e.,
to represent the frequency-curve as the difference of two normal curves.

Determining $p_3$ and then $p_1$ from (27), we find $p_1$ = -3.605; hence

$$\gamma^2 + 3.605\gamma + 4.170 = 0.$$

The roots of this equation are, however, imaginary. In the case of crabs' foreheads,
therefore, we cannot represent the frequency-curve for their forehead lengths as the
difference of two normal curves.
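The step from a root $p_2$ of the nonic to the pair of component centroids may be illustrated by a short Python sketch. It is a restatement under stated assumptions and not a transcription of equations (14) and (15): the proportions are obtained here from $z_1 + z_2 = 1$ together with $z_1\gamma_1 + z_2\gamma_2 = 0$, which expresses that the centroid of the whole curve is the origin. The function name `component_centroids` is hypothetical.

```python
import math


def component_centroids(p1, p2):
    """Solve gamma^2 - p1*gamma + p2 = 0 for the component centroids.

    Returns (gamma1, gamma2, z1, z2) with gamma1 < gamma2, or None when the
    roots are imaginary or coincident (no dissection is then possible).
    z1, z2 follow from z1 + z2 = 1 and z1*gamma1 + z2*gamma2 = 0 (assumed
    restatement of the proportion step).
    """
    disc = p1 * p1 - 4.0 * p2
    if disc <= 0:
        return None
    root = math.sqrt(disc)
    g1 = (p1 - root) / 2.0
    g2 = (p1 + root) / 2.0
    z1 = g2 / (g2 - g1)
    return g1, g2, z1, 1.0 - z1


# (p1, p2) pairs for the three forehead solutions quoted in the text.
first = component_centroids(-1.027, -8.757)
second = component_centroids(0.3412, -6.724)
third = component_centroids(-3.605, 4.170)

print(first)   # centroids near -3.517 and 2.490
print(second)  # centroids near -2.428 and 2.769
print(third)   # None: the roots are imaginary
```

The sketch orders the roots by size, so for the second solution the component at -2.428 is returned first; its proportion corresponds to the group of 532.8 individuals.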

(10.) So far as the nonic is concerned, our work is now accomplished. Taking the
biologist's measurements and assuming them to be the chance distribution of two
unequal groups about two different means, then one or other of our solutions is the
correct answer. Applying the test of the sixth moment, we find for the observations
$\mu_6$ = 177,004, while for the first solution it is 188,099 and for the second solution
192,446. According to this test, the first solution is the required one,* but, as we
have noticed, the two solutions are themselves much closer together than either to
the observations (see p. 75). In fact, the contours of the compound-curve for both
solutions are very close together, and neither differs more from the observations than
most normal curves differ from symmetrical frequency-curves in statistical measurements
of this kind.

* The theory of correlation will here, perhaps, confirm this result. Professor Weldon tells me that
the first and not the second solution is in good accordance with his other measurements.
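The test of the sixth moment can be reproduced from the component constants already tabulated. The Python sketch below is an illustration only; it uses the standard sixth moment of a normal curve about a point distant $b$ from its mean, $b^6 + 15b^4\sigma^2 + 45b^2\sigma^4 + 15\sigma^6$, and the function name is hypothetical.

```python
def mixture_sixth_moment(components, total):
    """Sixth moment, about the common centroid, of a sum of normal curves.

    components: iterable of (c, b, sigma), where c is the group total, b the
    distance of the group mean from the centroid, sigma the standard deviation.
    """
    acc = 0.0
    for c, b, s in components:
        acc += c * (b**6 + 15 * b**4 * s**2 + 45 * b**2 * s**4 + 15 * s**6)
    return acc / total


# Component constants of the two additive solutions, as tabulated in the text.
SOLUTION_1 = [(414.5, -3.517, 4.4685), (585.5, 2.490, 3.1154)]
SOLUTION_2 = [(467.2, 2.769, 2.878), (532.8, -2.428, 4.7702)]
OBSERVED_MU6 = 177004.0

mu6_first = mixture_sixth_moment(SOLUTION_1, 1000.0)
mu6_second = mixture_sixth_moment(SOLUTION_2, 1000.0)
print(mu6_first, mu6_second)
```

Both values exceed the observed sixth moment, the first solution by the smaller amount, which is the sense in which the test decides for it.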

The contours are so close that, notwithstanding we have demonstrated a theoretical
uniqueness for the solution of the problem (see p. 72, et seq.), we see that, from the
standpoint of practical statistics, it is possible for the given material to be broken up
into more than one pair of normal curves. Thus the problem indeed becomes somewhat
arbitrary, at any rate till the asymmetry of the frequency-curve becomes much
more marked than is the case with that of the foreheads of Naples crabs. Indeed,
although the method adopted leads to only two solutions, it is quite possible that
pairs of component normal curves might be tentatively found lying in the neighbourhood
of those determined by the above solutions, which would give resultant-curves
fairly close to the frequency-curve. Professor Weldon had, indeed, found by repeated
trials one such solution, but this solution differs widely in the third and higher
moments from the observations; it cannot, therefore, be considered to have the same
justification as those given by the present theory. Granted that the original observations
represent a mixture of two species varying about their mean according to
exact normal curves, our method gives two solutions, and two only. Without correlated
measurements, it might be difficult to discriminate between these solutions, at
any rate from the standpoint of practical statistics. The perhaps over-fine theoretical
test of the sixth moment decides for the first solution.

II. The Dissection of Symmetrical Frequency-Curves.

(11.) Another important case of the dissection of a frequency-curve can arise, when the 
frequency-curve, without being asymmetrical, still consists of the sum or difference of 
two components, i.e., when the means about which the component groups are distributed 
are identical. This case is all the more interesting and important, as it is not unlikely 
to occur in statistical investigations, and the symmetry of the frequency-curve is 
then in itself likely to lead the statistician to believe that he is dealing with an 
example of the normal frequency-curve. It seems to me that without very strong 
grounds for belief in the homogeneity of any statistical material, we ought not to be 
satisfied by its representation by the ordinary normal curve, simply because our 
results are symmetrical and fit the normal curve fairly well. We ought first 
to ascertain whether or not they would fit still better the sum or difference of two 
normal curves. This, at any rate, is a first stage to demonstrating the homogeneity 
of our material, although possibly our test for two may fail, not because our material is
homogeneous, but because its heterogeneity is multiple rather than double.*

* Symmetry might arise in the case of compound frequency-curves, even without identity of the
means of the components. In this case, for two components we should have for different means,
equality of component group-totals and of their standard-deviations. This equality seems less likely
than equality of means and divergence of totals and standard-deviations. Should it exist, however, we
fall back on a sub-case of the general case we have already dealt with. We need only, in Equations
(8)-(13), put $z_1 = z_2$, $\gamma_1 = -\gamma_2$, $u_1 = u_2$, and we have

$$z_1 = z_2 = \tfrac{1}{2}, \qquad \gamma_1^2(1 + u_1^2) = \mu_2, \qquad \gamma_1^4(1 + 6u_1^2 + 3u_1^4) = \mu_4,$$

whence

$$\gamma_1 = -\gamma_2 = \{\tfrac{1}{2}(3\mu_2^2 - \mu_4)\}^{1/4},$$

or,

$$c_1 = c_2 = \tfrac{1}{2}\alpha,$$

$$b_1 = -b_2 = h\,\{\tfrac{1}{2}(3\mu_2^2 - \mu_4)\}^{1/4},$$

$$\sigma_1 = \sigma_2 = h\,\{\mu_2 - \sqrt{\tfrac{1}{2}(3\mu_2^2 - \mu_4)}\}^{1/2}.$$

The possibility of the solution clearly depends on $3\mu_2^2$ being greater than $\mu_4$.

The following is an example of this special case. Mr. Merriman gives some results for American
target practice, on page 14 of his Text Book on Least Squares. He does not seem to have noticed that
the resulting curve is very far from a normal-curve. I find that for these observations

$\mu_1'$ = 6.482          $\mu_1$ = 0
$\mu_2'$ = 44.502         $\mu_2$ = 2.486
$\mu_3'$ = 320.582        $\mu_3$ = .104
$\mu_4'$ = 2405.094       $\mu_4$ = 15.793.

The smallness of $\mu_3$ indicates general symmetry; assuming then that the shots were fired in two groups
with equal precision, I find $c_1 = c_2$ and $b_1 = -b_2$ almost exactly.

We have accordingly

$b_1 = -b_2$ = 1.082,

$\sigma_1 = \sigma_2$ = 1.146.

[For the 1000 shots as a whole $\sigma$ = 1.577.]

Allowing for a uniform error of defective sighting amounting to .482, we find a compound-curve
fitting closely Mr. Merriman's figure, and indicating that the gun was aimed at the centres nearly of
divisions 5 and 7, and not at that of 6. Six was possibly white, 5 and 7 black. Like results of course
would arise from a change of sighting about midfiring.

We will now modify the results of our previous investigation to suit the case of a
symmetrical frequency-curve which has arisen from the superposition of two normal-curves
having the same axis. In this case if we write $b_1 = b_2 = 0$, $v_1 = \sigma_1/h$
$(= u_1\gamma_1)$, $v_2 = \sigma_2/h$ $(= u_2\gamma_2)$ in Equations (8) to (13) we have (9), (11) and (13)
identically satisfied, and (8), (10), and (12) become

$$z_1 + z_2 = 1 \qquad (31),$$

$$z_1 v_1^2 + z_2 v_2^2 = \mu_2 \qquad (32),$$

$$z_1 v_1^4 + z_2 v_2^4 = \tfrac{1}{3}\mu_4 \qquad (33).$$

Clearly we require one more equation. At first sight it might seem that a fourth 
equation would come readily, from the fact that the mid-ordinate m of the frequency- 
curve is the sum of the mid-ordinates of the component probability-curves. 

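This leads to

$$\frac{c_1}{\sqrt{2\pi}\,\sigma_1} + \frac{c_2}{\sqrt{2\pi}\,\sigma_2} = m,$$

or

$$\frac{z_1}{v_1} + \frac{z_2}{v_2} = M \qquad (34),$$

if

$$M = \sqrt{2\pi}\; m h/\alpha.$$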

But besides the disadvantage of throwing our solution back on the correctness with 
which we may have observed measurements of one size only, namely, the mean, the 
result of eliminating between (31)-(34) leads to an equation of the eighth order. To 
avoid this, it seems easier, as well as more accurate,* to take as the fourth equation 
that obtained from the sixth moment. 

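Let $\mu_6 \alpha h^6$ be the sixth moment of the given frequency-curve about its axis of
symmetry, then†

$$\mu_6 \alpha h^6 = 15\sigma_1^6 c_1 + 15\sigma_2^6 c_2,$$

or,

$$z_1 v_1^6 + z_2 v_2^6 = \tfrac{1}{15}\mu_6 \qquad (35).$$

The solution of (31), (32), (33), and (35) is easy.
Eliminating $z_2$ we have, writing $w_1 = v_1^2$, $w_2 = v_2^2$,

$$z_1 (w_1 - w_2) = \mu_2 - w_2,$$

$$z_1 w_1 (w_1 - w_2) = \tfrac{1}{3}\mu_4 - \mu_2 w_2,$$

$$z_1 w_1^2 (w_1 - w_2) = \tfrac{1}{15}\mu_6 - \tfrac{1}{3}\mu_4 w_2,$$

whence

$$w_1 = \frac{\tfrac{1}{15}\mu_6 - \tfrac{1}{3}\mu_4 w_2}{\tfrac{1}{3}\mu_4 - \mu_2 w_2} = \frac{\tfrac{1}{3}\mu_4 - \mu_2 w_2}{\mu_2 - w_2}.$$

* Because our equation then depends on all the observations.

† Generally, if $M_{2r}$ be the $2r$th moment of a probability-curve about its axis,

$$M_{2r} = (2r - 1)\,\sigma^2 M_{2r-2},$$

or,

$$M_{2r} = (2r - 1)(2r - 3) \ldots 5 . 3 . 1\; \sigma^{2r} c.$$

Thus

$$(\mu_4 - 3\mu_2^2)\, w_2^2 + (\mu_2\mu_4 - \tfrac{1}{5}\mu_6)\, w_2 - (\tfrac{1}{3}\mu_4^2 - \tfrac{1}{5}\mu_2\mu_6) = 0.$$

The two roots of this quadratic are clearly $w_1$ and $w_2$, so that the complete solution
is

$$c_1 = \alpha\,\frac{\mu_2 - w_2}{w_1 - w_2}, \qquad c_2 = -\alpha\,\frac{\mu_2 - w_1}{w_1 - w_2},$$

where $w_1$ and $w_2$ are roots of

$$(\mu_4 - 3\mu_2^2)\, w^2 + (\mu_2\mu_4 - \tfrac{1}{5}\mu_6)\, w - (\tfrac{1}{3}\mu_4^2 - \tfrac{1}{5}\mu_2\mu_6) = 0 \qquad (36).$$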
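The solution by the quadratic (36) may be tried on material of known composition. The Python sketch below is an illustration only: the mixture (30 per cent. with $\sigma = 1$, 70 per cent. with $\sigma = 2$, common mean, $h = 1$) is hypothetical and chosen merely so that the answer is known in advance, and `dissect_symmetric` is a name introduced for the purpose.

```python
import math


def dissect_symmetric(mu2, mu4, mu6, total=1.0):
    """Solve quadratic (36) for w = sigma^2 (h = 1) and return the components.

    Returns [(c1, sigma1), (c2, sigma2)], the greater root first, or None when
    the quadratic degenerates or its roots are not real, distinct and positive.
    A negative c marks a difference (subtractive) solution.
    """
    a = mu4 - 3.0 * mu2**2
    b = mu2 * mu4 - mu6 / 5.0
    c = -(mu4**2 / 3.0 - mu2 * mu6 / 5.0)
    if a == 0:
        return None  # the excess vanishes: there is no quadratic to solve
    disc = b * b - 4.0 * a * c
    if disc <= 0:
        return None
    r = math.sqrt(disc)
    w1, w2 = sorted(((-b + r) / (2 * a), (-b - r) / (2 * a)), reverse=True)
    if w2 <= 0:
        return None
    c1 = total * (mu2 - w2) / (w1 - w2)
    return [(c1, math.sqrt(w1)), (total - c1, math.sqrt(w2))]


# Hypothetical mixture with a common mean: proportions and standard deviations.
Z = (0.3, 0.7)
S = (1.0, 2.0)
mu2 = sum(z * s**2 for z, s in zip(Z, S))
mu4 = 3 * sum(z * s**4 for z, s in zip(Z, S))
mu6 = 15 * sum(z * s**6 for z, s in zip(Z, S))

parts = dissect_symmetric(mu2, mu4, mu6, total=1000.0)
print(parts)  # the 700 individuals with sigma = 2 come first
```

Since only the second, fourth and sixth moments enter, the sketch recovers the two groups exactly when the material really is such a mixture; with observed moments the roots may of course fail to be real or positive.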

(12.) Now we may note several general points about these equations.

Let $w_1$ be the greater root, then if

(i.) $\mu_2$ lie between $w_1$ and $w_2$, $c_1$ and $c_2$ are both positive, or the frequency-curve is
the sum of two normal curves.

(ii.) $\mu_2 > w_1$, $c_1$ is positive and $c_2$ negative, or the greater component group is
positive, we have then a real difference solution.

(iii.) $\mu_2 < w_2$, $c_1$ is negative and $c_2$ is positive, or again the greater component group
is positive, or we have a real difference solution.

Obviously if $\mu_4 = 3\mu_2^2$, and $\mu_6 = 5\mu_2\mu_4$, the coefficients of the quadratic (36) all
become zero, but these are just the conditions which would be satisfied if the
frequency-curve were a true normal curve. This gives for all practical purposes a very
sufficient test of whether a given symmetrical frequency-curve is a true normal curve.

If $\mu_4$ be not equal to $3\mu_2^2$, and $\mu_6$ be not equal to $5\mu_2\mu_4$, then we have no right to
assume that a symmetrical frequency-curve refers to homogeneous material. We must
then investigate whether a better result cannot be obtained by treating it as two
superposed normal curves having the same axis.

The quantities

$$\epsilon_1 = \frac{\mu_4 - 3\mu_2^2}{3\mu_2^2}, \qquad \epsilon_2 = \frac{\mu_6 - 5\mu_2\mu_4}{5\mu_2^3}$$

I propose to call the excess and defect of the frequency-curve. The excess measures
the excess of one-third of the fourth moment over the square of the second
moment; the defect measures the defect of the fourth moment from one-fifth the
ratio of the sixth moment to the second moment.* Here "excess" and "defect"
are used in the algebraic sense, and may take either sign. They appear to be a good
measure for practical purposes of the divergence of a given symmetrical frequency-curve
from the normal type.

* The introduction of the factor $1/\mu_2^2$ into both excess and defect is to preserve a relative as
distinguished from an absolute measure of divergence.

We may now express the quadratic (36) in terms of $\epsilon_1$ and $\epsilon_2$, and analyze the
results according to the character of the excess and defect.

The quadratic becomes

$$3\epsilon_1\left(\frac{w}{\mu_2}\right)^2 - \epsilon_2\,\frac{w}{\mu_2} + \epsilon_2 - 3\epsilon_1(1 + \epsilon_1) = 0.$$

This gives

$$\frac{w}{\mu_2} = \frac{\epsilon_2 \pm \sqrt{(\epsilon_2 - 6\epsilon_1)^2 + 36\epsilon_1^3}}{6\epsilon_1} \qquad (37).$$
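The excess and defect, and the roots (37), are readily computed. The Python sketch below is an illustration only. It evaluates $\epsilon_1$ and $\epsilon_2$ for the moments $\mu_2$ = 7.5092, $\mu_4$ = 176.7280, $\mu_6$ = 7,919.2781 quoted in § 13 for Crab Measurements No. 4, and it checks formula (37) on a hypothetical mixture (30 per cent. with $\sigma = 1$, 70 per cent. with $\sigma = 2$, so that $\mu_2$ = 3.1, $\mu_4$ = 34.5, $\mu_6$ = 676.5), for which the roots $w$ must be 4 and 1. The function names are hypothetical.

```python
import math


def excess_defect(mu2, mu4, mu6):
    """Return (epsilon1, epsilon2): the excess and defect as defined in the text."""
    eps1 = (mu4 - 3.0 * mu2**2) / (3.0 * mu2**2)
    eps2 = (mu6 - 5.0 * mu2 * mu4) / (5.0 * mu2**3)
    return eps1, eps2


def roots_37(eps1, eps2):
    """Roots w/mu2 of 3*e1*u^2 - e2*u + e2 - 3*e1*(1 + e1) = 0, greater first.

    Returns None when the excess is zero or the roots are imaginary.
    """
    if eps1 == 0:
        return None
    inner = (eps2 - 6.0 * eps1) ** 2 + 36.0 * eps1**3
    if inner < 0:
        return None
    r = math.sqrt(inner)
    u = ((eps2 + r) / (6.0 * eps1), (eps2 - r) / (6.0 * eps1))
    return max(u), min(u)


# Moments quoted in section 13 for Crab Measurements No. 4.
eps1, eps2 = excess_defect(7.5092, 176.7280, 7919.2781)
print(eps1, eps2)

# Hypothetical mixture used only to check (37): sigma^2 = 4 and 1.
MIX_MU2 = 3.1
u1, u2 = roots_37(*excess_defect(MIX_MU2, 34.5, 676.5))
print(u1 * MIX_MU2, u2 * MIX_MU2)
```

Working with the ratios $w/\mu_2$ keeps the computation free of the unit of measurement, which is the purpose of the factor $1/\mu_2^2$ in the definitions.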

We have the following cases:

(i.) $\epsilon_1$ and $\epsilon_2$ both positive. Then the values of $w$ are both real, but they must
also be both positive, otherwise $\sigma_1$ and $\sigma_2$ would not be real. It is necessary,
therefore, that

$$\epsilon_2 > 3\epsilon_1(1 + \epsilon_1).$$

(ii.) $\epsilon_1$ and $\epsilon_2$ both negative. Then $w$ will be real if, when

$$\sqrt{(-\epsilon_1)} < 1,$$

$(-\epsilon_2)$ does not lie between

$$6(-\epsilon_1)\{1 + \sqrt{(-\epsilon_1)}\}$$

and

$$6(-\epsilon_1)\{1 - \sqrt{(-\epsilon_1)}\}.$$

If

$$\sqrt{(-\epsilon_1)} > 1,$$

then we must have

$$(-\epsilon_2) > 6(-\epsilon_1)\{1 + \sqrt{(-\epsilon_1)}\}.$$

Further, in order that $w$ may have both values positive, we must have

$$(-\epsilon_2)^2 > \{(-\epsilon_2) - 6(-\epsilon_1)\}^2 - 36(-\epsilon_1)^3,$$

or

$$(-\epsilon_2) > 3(-\epsilon_1)\{1 - (-\epsilon_1)\}.$$

This latter condition is clearly satisfied if

$$\sqrt{(-\epsilon_1)} > 1.$$

On the other hand, if

$$\sqrt{(-\epsilon_1)} < 1,$$

it is easy to see that

$$3(-\epsilon_1)\{1 - (-\epsilon_1)\}$$

is less than

$$6(-\epsilon_1)\{1 - \sqrt{(-\epsilon_1)}\}.$$

Hence, our final conditions are: if

$$\sqrt{(-\epsilon_1)} > 1,$$

then

$$(-\epsilon_2) > 6(-\epsilon_1)\{1 + \sqrt{(-\epsilon_1)}\};$$

but if

$$\sqrt{(-\epsilon_1)} < 1,$$

then either

$$(-\epsilon_2) > 6(-\epsilon_1)\{1 + \sqrt{(-\epsilon_1)}\},$$

or it must lie between

$$3(-\epsilon_1)\{1 - (-\epsilon_1)\}$$

and

$$6(-\epsilon_1)\{1 - \sqrt{(-\epsilon_1)}\}.$$

(iii.) $\epsilon_1$ positive and $\epsilon_2$ negative; if the values of $w$ are real, one must be negative,
and therefore the solution impossible.

(iv.) $\epsilon_1$ negative and $\epsilon_2$ positive; if the values of $w$ are real, one must be negative,
and therefore the solution impossible.

Thus we conclude:

If the excess and defect are not zero, the frequency-curve, although symmetrical, is
not normal. If the excess and defect are of opposite signs, then the frequency-curve
cannot be broken up into the sum or difference of two normal curves with common
axis. The frequency-curve, if compounded of normal-curves at all, is of a higher and
more complex character. If the excess and defect are of the same sign, then,
provided certain relations hold between the numerical values of the excess and defect
given in (i.) and (ii.) above, there is a real solution of the equation which resolves the
frequency-curve into two components.

(13.) I propose to illustrate this discussion by the consideration of a numerical 
example. Professor Weldon has kindly complied with my request for the numerical 
details of the most symmetrical curve deduced from his measurements of Naples 
crabs by placing the following statistics for a shell measurement — No. 4 of his series 
— at my disposal. The resultant-curve and the corresponding normal curve are 
pictured in fig. 3 (Plate 3). Clearly, from the ordinary statistician's standpoint, we 
could not expect a more symmetrical result, or a closer graphical agreement, with the 
normal curve. But is this a real or merely an apparent agreement? The answer is, 
as we shall see, vital for the interpretation to be put on Professor Weldon's results.

Crab Measurements. No. 4. (Total Number of Crabs = 999.)

Abscissae.   Ordinates              Abscissae.   Ordinates
             (1 unit = 1 crab).                  (1 unit = 1 crab).

    1              1                    11            126
    2              3                    12             82
    3              5                    13             72
    4             11                    14             41
    5             40                    15             28
    6             55                    16              8
    7             98                    17              7
    8            121                    18              0
    9            152                    19              0
   10            147                    20              2

The first six moments were calculated exactly as in the previous case of § 9, by aid
of Table I., except that $\alpha$ now equals 999, and we go a stage further to $\mu_6'$ and $\mu_6$.
$h$ equals unity as before. We have

$\mu_1'$ = 9.684,684              $\mu_1$ = 0
$\mu_2'$ = 101.3022               $\mu_2$ = 7.5092
$\mu_3'$ = 1,129.9971             $\mu_3$ = 3.4751
$\mu_4'$ = 13,334.0710            $\mu_4$ = 176.7280
$\mu_5'$ = 165,488.8438           $\mu_5$ = 271.6007
$\mu_6'$ = 2,150,845.6867         $\mu_6$ = 7,919.2781

These results give for the position of the centroid $d = \mu_1'$ = 9.6847, and for the
standard-deviation $\sigma = \sqrt{\mu_2}$ = 2.7403. This gives the modulus 3.874, and the central
ordinate of the normal curve 145.44. The modulus, as calculated from the mean
error, is 3.8634, so that the agreement is very close. The normal curve in fig. 3 is
constructed from the values $d$ = 9.6847, $\sigma$ = 2.7403, and $y$ = 145.44 by aid of Table II.
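The constants of the single normal curve follow directly from the frequency table. The Python sketch below is an illustration only; it takes the ordinates of Crab Measurements No. 4 as tabulated above (the two empty classes 18 and 19 being entered as zero) and recomputes the centroid, the standard-deviation, and the central ordinate $c/(\sqrt{2\pi}\,\sigma)$. The helper names are hypothetical.

```python
import math

# Crab Measurements No. 4: abscissa -> number of crabs.
CRAB4 = {1: 1, 2: 3, 3: 5, 4: 11, 5: 40, 6: 55, 7: 98, 8: 121, 9: 152,
         10: 147, 11: 126, 12: 82, 13: 72, 14: 41, 15: 28, 16: 8, 17: 7,
         18: 0, 19: 0, 20: 2}


def raw_moment(freq, k):
    """k-th moment about the origin of abscissae, per individual."""
    n = sum(freq.values())
    return sum(count * x**k for x, count in freq.items()) / n


def central_moment(freq, k):
    """k-th moment about the centroid, per individual."""
    n = sum(freq.values())
    mean = raw_moment(freq, 1)
    return sum(count * (x - mean) ** k for x, count in freq.items()) / n


total = sum(CRAB4.values())
d = raw_moment(CRAB4, 1)
sigma = math.sqrt(central_moment(CRAB4, 2))
y0 = total / (math.sqrt(2.0 * math.pi) * sigma)
print(total, d, sigma, y0)
```

The central moments are here formed directly about the centroid instead of being transferred from the moments about the origin; for a table of this size the two routes agree to the decimals retained in the text.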

The following additional quantities were now calculated:

$\mu_4 - 3\mu_2^2$ = 7.5637
$\epsilon_1$ = .044,712
$\lambda_4$ = -22.6911
$\mu_5 - 10\mu_2\mu_3$ = 10.6485
$\lambda_5$ = -31.9455
$\mu_6 - 5\mu_2\mu_4$ = 1283.8486
$\epsilon_2$ = .606,45

If we had a perfect probability-curve, $\mu_3$, $\mu_5$, $\mu_4 - 3\mu_2^2$, and $\mu_6 - 5\mu_2\mu_4$ ought to be
zero. This, of course, we should not expect in any actual set of observations, but the
comparative smallness of $\mu_3$, $\mu_5$, $\lambda_4$, $\lambda_5$, $\epsilon_1$, and $\epsilon_2$ shows a very fair approximation to
the symmetry of the normal curve in these results.

Since $\epsilon_2 > 3\epsilon_1(1 + \epsilon_1)$, we see that the roots (37) of our p. 94 are both positive,
and accordingly it is possible to break up the observation-curve into two normal
curves with coincident axes.

Calculating the two values of $w$ we have

$w_1/\mu_2$ = 3.50971,          $w_2/\mu_2$ = 1.01148,

whence from p. 93:

$c_1 = -\alpha \times .0046$,          $c_2 = \alpha \times 1.0046$,

$\sigma_1 = \sqrt{(\mu_2 \times 3.50971)}$,    $\sigma_2 = \sqrt{(\mu_2 \times 1.01148)}$,

or

$c_1$ = -5,*                   $c_2$ = 1004,

$\sigma_1$ = 5.134,            $\sigma_2$ = 2.756.

For all practical purposes the second group gives the normal curve ($c$ = 999,
$\sigma$ = 2.740) of the set of observations; that a half per cent. of Crabs have been
removed by selection about the same mean is not large enough to be significant in
measurements of the kind we are here dealing with. So far, then, we may say that
No. 4 of Professor Weldon's measurements cannot be treated as the sum or difference
of two normal curves having their axes coincident with any substantial improvement
on the normal curve peculiar to the original group.

(14.) Hitherto we have used "Crab Measurements No. 4" to illustrate the dissection
of symmetrical frequency-curves, but a little consideration shows at once that
this judging of symmetry by the eye is very likely to be fallacious, and No. 4 may,
after all, break up into two normal curves with non-coincident axes. Should these
two curves correspond to practically the same groups as in the case of the "Foreheads,"
then we shall have demonstrated that the asymmetry of that frequency-curve
is in all probability due to a mixture of two families in the Naples Crabs and not a
result of differentiation going on in one homogeneous species. The apparent symmetry
of No. 4 weighs nothing in the balance, as may be readily tested by adding together
two normal curves with not widely divergent axes or totals.

What we have been investigating, therefore, in § 13 is really only the special case
in which the method of our first investigation would fail, owing to the coincidence of
the axes of the component normal curves, a coincidence which is improbable a priori.

I, therefore, proceeded to form the nonic for No. 4, a result which requires only the
values of $\mu_3$, $\lambda_4$, and $\lambda_5$ already given.†

* The nearest whole number is here taken for the Crabs in each group.
† The arithmetic throughout was of course of a most laborious character.

The nonic being

$$p_2^9 + a_2 p_2^7 + a_3 p_2^6 + a_4 p_2^5 + a_5 p_2^4 + a_6 p_2^3 + a_7 p_2^2 + a_8 p_2 + a_9 = 0,$$

the coefficients were

$a_2$ = 26.47295.
$a_3$ = 18.11448.
$a_4$ = 325.549,646,39.
$a_5$ = 1604.777,825,114.
$a_6$ = 977.342,661,4.
$a_7$ = -3154.200,688,8.
$a_8$ = -4412.284,243,7.
$a_9$ = -1761.180,374.

Writing $p_2 = -\chi$, we have for the nonic $f(\chi)$ and its first derived function* $f_1(\chi)$
the following expressions:

$$f(\chi) = \chi^9 + 26.472{,}95\,\chi^7 - 18.114{,}48\,\chi^6 + 325.549{,}646\,\chi^5 - 1604.777{,}825\,\chi^4$$
$$\qquad + 977.342{,}661\,\chi^3 + 3154.200{,}689\,\chi^2 - 4412.284{,}244\,\chi + 1761.180{,}374 = 0,$$

and

$$f_1(\chi) = \chi^8 + 20.590{,}07\,\chi^6 - 12.076{,}32\,\chi^5 + 180.860{,}915\,\chi^4 - 713.234{,}589\,\chi^3$$
$$\qquad + 325.780{,}887\,\chi^2 + 700.933{,}486\,\chi - 490.253{,}805.$$

The Sturm's functions $f(\chi)$, $f_1(\chi)$, $f_2(\chi)$, . . . $f_9(\chi)$ were now formed, and
their signs gave the following totals:

For $\chi = \infty$: 4 changes.
For $\chi = 0$: 4 changes.
For $\chi = -\infty$: 5 changes.

Thus the nonic has one root of $\chi$ between 0 and $-\infty$, and no roots between 0
and $+\infty$. In other words it has 8 imaginary roots and only 1 real one.

* Divided by the factor 9.

This root was now localized. Putting $p_2 = 10\chi'$ in the original nonic, I easily
found $\chi'$ to lie between 0 and 1, then between .15 and .16, and by a succession of
approximations to be .1533, and finally .15326.

Thus

$$p_2 = 1.5326.$$

$p_3$ was then ascertained from equation (27) of p. 84, and finally $p_1 = p_3/p_2$ was
found to be 2.17245. The quadratic (28) for $\gamma$ was then:

$$\gamma^2 - 2.17245\gamma + 1.5326 = 0,$$

which has both its roots imaginary.

Thus, considerably to my surprise, but greatly to my satisfaction, it was demonstrated 
that there is no solution whatever of the problem of breaking up the curve of No. 4 
measurements into two normal components. 

All nine roots of the fundamental nonic lead to imaginary solutions of the problem. 
The best and most accurate representation of No. 4 is the normal curve of fig. 3. 
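The formation of the nonic for No. 4 can be retraced numerically. The Python sketch below is an illustration only. Equation (29) is not quoted in this section, so the coefficient formulas in `nonic_coefficients` are an assumed form of the nonic divided by 24, adopted because they reproduce the eight coefficients printed above from $\mu_3$ = 3.4751, $\lambda_4$ = -22.6911, $\lambda_5$ = -31.9455. The real root is then found by simple bisection, not by Sturm's functions; all function names are hypothetical.

```python
def nonic_coefficients(mu3, lam4, lam5):
    """Coefficients [1, 0, a2, ..., a9] of the nonic in p2, highest power first.

    Assumed form of the fundamental nonic divided by 24 (see the lead-in).
    """
    a2 = -28.0 * lam4 / 24.0
    a3 = 36.0 * mu3**2 / 24.0
    a4 = -(24.0 * mu3 * lam5 - 10.0 * lam4**2) / 24.0
    a5 = -(148.0 * mu3**2 * lam4 + 2.0 * lam5**2) / 24.0
    a6 = (288.0 * mu3**4 - 12.0 * lam4 * lam5 * mu3 - lam4**3) / 24.0
    a7 = (24.0 * mu3**3 * lam5 - 7.0 * mu3**2 * lam4**2) / 24.0
    a8 = 32.0 * mu3**4 * lam4 / 24.0
    a9 = -(mu3**6)
    return [1.0, 0.0, a2, a3, a4, a5, a6, a7, a8, a9]


def horner(coeffs, x):
    """Evaluate a polynomial given highest power first."""
    acc = 0.0
    for c in coeffs:
        acc = acc * x + c
    return acc


def bisect(coeffs, lo, hi, tol=1e-12):
    """Root of the polynomial between lo and hi, which must bracket a sign change."""
    f_lo = horner(coeffs, lo)
    if f_lo * horner(coeffs, hi) > 0:
        raise ValueError("no sign change on the interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = horner(coeffs, mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


coeffs = nonic_coefficients(3.4751, -22.6911, -31.9455)
p2_root = bisect(coeffs, 1.5, 1.6)
print(coeffs[2:], p2_root)
```

With $p_1$ = 2.17245 as quoted, the discriminant $p_1^2 - 4p_2$ is negative at this root, which is the imaginary result stated above.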

The result of this investigation seems to me most important. Professor Weldon's 
material is homogeneous, and the asymmetry of the " forehead " curve points to a real 
differentiation in that organ, and not to a mixture of two families having been 
dredged up. 

On the other hand, I cannot think that for the problem of evolution the dissection 
of the most symmetrical curve given by the measurements is unnecessary. There 
will always be the problem : Is the material homogeneous and a true evolution going 
on, or is the material a mixture ? To throw the solution on the judgment of the eye 
in examining the graphical results is, I feel certain, quite futile. 

Whenever in measuring a series of organs the results give an asymmetrical curve, 
we must accordingly proceed as follows :— 

Stage (i).— Break up this asymmetrical curve into components ; if there are several 
solutions, the theory of correlation or the test of the sixth moment will, perhaps, 
enable us to say which is the most satisfactory. 

Stage (ii). — Endeavour to break up the most symmetrical curve ; if it cannot be 
broken up, either into normal components with non-coincident axes or normal com- 
ponents with coincident axes, the material is homogeneous and the asymmetrical curve 
points to a true differentiation in the organ to which it refers. If, on the other hand, 
the most symmetrical frequency-curve does break up, then if the numbers in its 
component groups be the same (or practically the same) as in those corresponding to 
the asymmetrical curve, we are really dealing with a mixture of heterogeneous material, 
and we shall have ascertained the proportions of the mixture. If the numbers 
should not be the same, then we cannot assert that we have a mixture, but we have 
found a case of differentiation in both organs at the same time.* 

* Bertillon has found a double-humped frequency-curve for the height of the inhabitants of the
department of the Doubs. Mr. Bateson has found a double-humped curve for the claspers of Earwigs.
Without the investigation of measurements of another organ, it seems impossible to say whether the
inhabitants of the Doubs, as Bertillon supposes, are a mixture of races, or Mr. Bateson's earwigs were
really homogeneous. In either case our methods of investigation would show the proportions belonging
to each group of the mixture, or to each group of the differentiating species.

These stages seem to represent the mathematical treatment of this portion of the
problem of evolution.

(15.) Although the nonic corresponding to "Crabs No. 4," has no real negative
root, I found on tracing its value for values of $p_2$ between 0 and -2, that near
$p_2$ = -.82 it reached a minimum value of about 199 as compared with about 1761
at 0, 1254 at -2. Here then was, as it were, a tendency towards a root, and
the question occurred to me whether this "tendency" in any way corresponded to
the groups into which the "foreheads" were differentiated. I therefore investigated
the root of the first derived function of the nonic lying about -.82, and found it
to be -.8497. This led to $p_1$ from equation (27) being -5.2521, whence

$$\gamma^2 + 5.2521\gamma - .8497 = 0,$$

or

$$\gamma_1 = .15705, \qquad \gamma_2 = -5.40915.$$

Whence nearly

$$z_1 = .972, \qquad z_2 = .028,$$

or the numbers in the two groups are

$$c_1 = 971 \quad \text{and} \quad c_2 = 28.$$

Clearly even this "tendency to a root" in no way fits either solution of the
"forehead" case, and No. 4 measurements neither break up, nor have they even
a tendency to break up, in the same manner as the "foreheads." Since the nonic
must always have a "tendency" to two real roots at a time, we may note that the
other root to which it may be said to tend, or for which $f(p_2)$ is a minimum, lies
between -.9 and -1, and is just as insignificant as that investigated above. We
may say that not only is the material of No. 4 homogeneous, but it has not even a
"tendency" towards heterogeneity.

III.

(16.) The object of the present paper being solely to illustrate a general method
for the reduction of frequency-curves to normal types, and not a biological investigation,
it might suffice to stop at this point, when the rules for the reduction of symmetrical
and asymmetrical curves have been given and illustrated. But it must be remembered
that the method depends upon the solution of a nonic, and that the variety presented
by the roots of this equation suggests very considerable divergences and peculiarities
as likely to arise, when a considerable number of frequency-curves are dealt with.

The discussion of the case of Crabs must not be taken as indicating that the 
incidents of this case will be generally true for other groups of biological measure- 
ments, until a very great variety of such groups of measurements have been 
mathematically analyzed. 

In order to throw more light on the general question, I have added the following 
analysis for the case of Prawns, the measurements for which were kindly placed at 
my disposal by Mr. H. Thompson, who has been making elaborate measurements of 
1,000 specimens in the Zoological Laboratory of University College, London. 



Palcemon serratus. — Measurements in 998 ? specimens (adult) from penultimate 

to hindmost tooth on the carapace. 



Measurements reduced                            Measurements reduced
to thousandths of body   Number of specimens.   to thousandths of body   Number of specimens.
length.                                         length.

        27                       1                      49                      25
        28                       0                      50                      17
        29                       0                      51                      11
        30                       0                      52                       8
        31                       1                      53                       4
        32                       0                      54                       1
        33                       3                      55                       0
        34                       3                      56                       0
        35                       4                      57                       1
        36                      11                      58                       1
        37                      24                      59                       0
        38                      38                      60                       0
        39                      56                      61                       0
        40                      80                      62                       0
        41                     105                      63                       0
        42                     121                      64                       0
        43                     117                      65                       1
        44                     108                      66                       0
        45                      77                      67                       0
        46                      69                      68                       0
        47                      62                      69                       1
        48                      48


The novel and somewhat remarkable feature in these results are the " giants " at 
65 and 69. To neglect these giants, as in some degree anomalous, would, no doubt 
be convenient, so far as the analysis is concerned, and would lead to a simpler reduction 
of the group. They have, however, been retained as among the data given to me, 
and their presence affords an interesting illustration of the various singularities which 
may arise in the solution of the fundamental nonic. 
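The weight carried by the two "giants" in the higher moments can be gauged by computing the moments with and without them. The Python sketch below is an illustration only and not part of the reduction described in the text: it takes the frequencies as tabulated above, measures the centroid from the abscissa 27 (an assumption made here because that origin reproduces the first moment 16.191,382,8 printed in § 17), and repeats the calculation after dropping the specimens at 65 and 69. The names are hypothetical.

```python
# Prawn carapace measurements: thousandths of body length -> number of specimens
# (empty classes omitted).
PRAWNS = {27: 1, 31: 1, 33: 3, 34: 3, 35: 4, 36: 11, 37: 24, 38: 38, 39: 56,
          40: 80, 41: 105, 42: 121, 43: 117, 44: 108, 45: 77, 46: 69, 47: 62,
          48: 48, 49: 25, 50: 17, 51: 11, 52: 8, 53: 4, 54: 1, 57: 1, 58: 1,
          65: 1, 69: 1}


def moments(freq, origin=0.0):
    """Return (n, distance of centroid from origin, {k: central moment k})."""
    n = sum(freq.values())
    mean = sum(c * x for x, c in freq.items()) / n
    central = {k: sum(c * (x - mean) ** k for x, c in freq.items()) / n
               for k in (2, 3, 4)}
    return n, mean - origin, central


n_all, d_all, mu_all = moments(PRAWNS, origin=27)

# The same material with the two "giants" at 65 and 69 set aside.
trimmed = {x: c for x, c in PRAWNS.items() if x < 65}
n_trim, d_trim, mu_trim = moments(trimmed, origin=27)

print(n_all, d_all, mu_all)
print(n_trim, d_trim, mu_trim)
```

Two specimens in 998 barely move the centroid, but they contribute a large share of the third and fourth moments, on which the coefficients of the nonic depend.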



(17.) The curve (see fig. 4) given by the observed numbers will be at once seen to 
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be distinctly asymmetrical. Adopting the carapace length 31 as the origin of 
coordinates, and using the same notation as before, we have the following results : — * 

fi\ = d (= q) = 16-191,382,8 ^ = 

fi/ s = 276-277,555 fi 2 = 14116,678,13 

fi\ = 4,963-876,753,5 fi 3 = 33*424,02673 

;jl\= 94,386*734,469 /x 4 = 1,288*640,094,26 

// 5 = 1,920,725-520,040 /x 6 == 16,752*563,9961 

X 4 = — 2072-394,903 

X 5 = — 36,102-605,1706. 

The standard-deviation of the group as a whole is given by a = \/^ or 

o-= 3*7572. 

The mean errort obtained from <x . . . . =2*9978 
„ „ „ directly . . . . = 2*8776. 

(In the case of the " foreheads " of Crabs, the mean error from <x was 3*8028, and 
directly 4*4087. This divergence between the mean error, as found practically from 
second and first moments, is a very good test of the asymmetry of the frequency- 
curve. In the very symmetrical measurements of " Crabs No. 4," the modulus, as 
calculated from the standard-deviation and from the mean error, had the near values 
3*874 and 3*863.) 

The curve obtained from the observations as a single group (i.e., d = 16*1914 and 
a = 3*7572) is given in fig. 4 (Plate 4). 

Taking ^ = ygp 2 we have for the fundamental nonic and its first differential 

AX) = X 9 /' (X) = 9X 8 

+ 24-177,940,535 x 7 + 169-245,583,743 x 6 

+ 1-675,748, 344 x 6 + 10-054,490,066 x 5 

+ 299-620,303,770 X 5 + 1498'101,518,851 x 4 

— 943-393,909,962 x 4 - 3773-575,639,850 x 3 

— 864-540 ; 147,350 x 3 - 2593-620,442,052 x 3 

— 274'750,163,918 x 2 - 549-500,327,835 x 

— 34-486,278,563 X — 34-486,278,563 

— 1-394,286,418 = 0. 

* These results were calculated to a higher degree of accuracy than in the case of the Crabs, a result 
rendered necessary by the apparent sensitiveness of the roots in this case to a slight change in the value 
of the coefficients of the nonic. 

† Mean error is here used, not in Gauss's sense, but in the sense of arithmetically mean error,
= 0.7979 $\sigma$ theoretically.
Clearly there is only one positive root. This was found to be

$\chi = 2.5868658$.

This gave $p_2 = 25.868658$, whence I found $p_1 = 9.669970$.

Consequently the roots of

$\gamma^2 - p_1 \gamma + p_2 = 0$

were imaginary and no solution involving the difference of two normal components
was possible.

The next stage was to find the negative roots. These were easily demonstrated to
lie between 0 and $-1$, and then it was shown that the value of $f(\chi)$ changed sign only
twice between these values. Thus the nonic was proved, without calculating Sturm's
functions, to have only three real roots. The two negative roots are:

$\chi_1 = -0.15448114$

and

$\chi_2 = -0.07826295$.
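The three real roots quoted above can be verified by direct evaluation of the nonic. The sketch below uses Horner's rule and simple bisection; the bracketing intervals are illustrative choices, not taken from the memoir, and the coefficient of the eighth power is zero because that term is absent.

```python
# Coefficients of the nonic from the ninth power down to the constant term.
COEFFS = [1.0, 0.0, 24.177940535, 1.675748344, 299.620303770,
          -943.393909962, -864.540147350, -274.750163918,
          -34.486278563, -1.394286418]

def f(chi):
    """Evaluate the nonic at chi by Horner's rule."""
    value = 0.0
    for a in COEFFS:
        value = value * chi + a
    return value

def bisect(lo, hi, steps=200):
    """Locate a root in [lo, hi]; f must change sign over the interval."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("no sign change over the interval")
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

# Illustrative brackets: one positive root and two roots between 0 and -1.
positive_root = bisect(2.0, 3.0)
negative_roots = [bisect(-0.2, -0.1), bisect(-0.1, -0.05)]
print(positive_root, negative_roots)
```

The sign of f is negative at 0, positive between the two negative roots, and negative again at -0.2, which is the double change of sign referred to in the text.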

These roots lead to the following solutions : — 

(A.) First additive Solution for Carapace of Prawns.

$p_2 = -1.5448114$,          $p_1 = 26.7580108$,
$\gamma_1 = -0.0576086$,     $\gamma_2 = 26.8156194$,
$z_1 = 0.997856$,            $z_2 = 0.002144$.

1st Component.               2nd Component.
$c_1 = 995.860$,             $c_2 = 2.140$,
$b_1 = -0.0576086$,          $b_2 = 26.8156194$,
$\sigma_1 = 3.5595$,         $\sigma_2 = 5.7626\sqrt{-1}$,
$y_1 = 111.6142$.            $y_2 =$ imaginary.



(B.) Second additive Solution for Carapace of Prawns.

$p_2 = -0.7826295$,          $p_1 = 5.1635907$,
$\gamma_1 = -0.1473614$,     $\gamma_2 = 5.3109521$,
$z_1 = 0.9730024$,           $z_2 = 0.0269976$.

1st Component.               2nd Component.
$c_1 = 971.0564$,            $c_2 = 26.9436$,
$b_1 = -0.1473614$,          $b_2 = 5.3109521$,
$\sigma_1 = 3.389672$,       $\sigma_2 = 8.932996$,
$y_1 = 114.28698$.           $y_2 = 1.203280$.
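The step from each pair $p_1$, $p_2$ to the component centres and areas can be sketched as follows. It is assumed here, since the relations are not restated in this section, that the proportions satisfy $z_1 + z_2 = 1$ and $z_1\gamma_1 + z_2\gamma_2 = 0$ (the mean of the whole group lying at the origin of $\gamma$), and that the areas are the proportions multiplied by the 998 observations; these assumptions reproduce the tabulated values. The function name is illustrative.

```python
import math

N_TOTAL = 998  # total number of observations (area of the whole group)

def components_from_p(p1, p2, n=N_TOTAL):
    """Return (g1, g2, z1, z2, c1, c2), or None if the gammas are imaginary."""
    disc = p1 * p1 - 4.0 * p2
    if disc < 0:
        return None  # roots of gamma^2 - p1*gamma + p2 = 0 are imaginary
    root = math.sqrt(disc)
    g1 = 0.5 * (p1 - root)
    g2 = 0.5 * (p1 + root)
    # Assumed conditions: z1 + z2 = 1 and z1*g1 + z2*g2 = 0.
    z1 = g2 / (g2 - g1)
    z2 = -g1 / (g2 - g1)
    return g1, g2, z1, z2, z1 * n, z2 * n

solution_a = components_from_p(26.7580108, -1.5448114)
solution_b = components_from_p(5.1635907, -0.7826295)
positive_root_case = components_from_p(9.669970, 25.868658)
print(solution_a)
print(solution_b)
print(positive_root_case)  # None: no solution from the positive root
```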



To these solutions we may add:

(C.) Parameters of Normal Curve deduced from entire group of observations.

$d = 16.191383$,
$c = 998$,
$\sigma = 3.7572$,
$y = 105.96804$.

(D.) Parameters of Normal Curve deduced by excluding two "giants" from
observations.

$d = 16.14357$ ($b = -0.04781$),
$c = 996$,
$\sigma = 3.6051$,
$y = 110.21786$.
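The maximum ordinates $y$ tabulated in (A) to (D) agree with the usual expression for a normal curve of area $c$ and standard-deviation $\sigma$, namely $y = c/(\sigma\sqrt{2\pi})$. A brief check, with an illustrative function name:

```python
import math

def max_ordinate(c, sigma):
    """Central ordinate of a normal curve of area c and standard-deviation sigma."""
    return c / (sigma * math.sqrt(2.0 * math.pi))

# (area, sigma) pairs as tabulated in the text.
curves = {
    "A, 1st component": (995.860, 3.5595),
    "B, 1st component": (971.0564, 3.389672),
    "B, 2nd component": (26.9436, 8.932996),
    "C, whole group": (998.0, 3.7572),
    "D, giants excluded": (996.0, 3.6051),
}
ordinates = {name: max_ordinate(c, s) for name, (c, s) in curves.items()}
for name, y in ordinates.items():
    print(f"{name}: y = {y:.4f}")
```

The second component of solution (A) is omitted because its standard-deviation, and therefore its ordinate, is imaginary.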

The curves corresponding to (A), (B), (C), and (D) as well as the observation-curve
are given in figs. 4 and 5, and I shall now proceed to discuss several important
points with regard to them.

(18.) The first point to be noted is the existence of the dwarf, carapace 27, and 
the giants, carapaces 65 and 69. 

The normal curve has a standard-deviation 3.7572, and the mean carapace being
about 43, we have no less than three measurements deviating by more than four
times the standard-deviation from the mean; two of them, indeed, differ by nearly
six times the standard-deviation from the mean. We might expect three such
deviations of over four times the standard-deviation to occur in the measurement of
50,000 Prawns, but they are extremely improbable in the measurement of 1000
Prawns. That two should occur in the measurement of 1000 Prawns, with a
deviation six times the standard, is so improbable that it ought to lead us to reject
the normal curve as a representation of the measurements. We are either dealing
with a mixed population of Prawns, or possibly there are a few deformed individuals
amid a normal population.*
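The improbability of such deviations can be illustrated with the normal probability integral. The sketch below counts deviations on both sides of the mean; the mean carapace is taken as 43, the round figure given in the text, so the standardised deviations are approximate.

```python
import math

def two_sided_tail(k):
    """Chance that a normal deviate exceeds k standard-deviations in absolute value."""
    return math.erfc(k / math.sqrt(2.0))

SIGMA = 3.7572
MEAN_CARAPACE = 43.0   # "about 43" in the text; an approximation

# Dwarf and the two giants, expressed in units of the standard-deviation.
deviations = {size: (size - MEAN_CARAPACE) / SIGMA for size in (27, 65, 69)}

expected_in_50000 = 50000 * two_sided_tail(4.0)
expected_in_1000 = 1000 * two_sided_tail(4.0)
chance_six_sigma = two_sided_tail(6.0)

print(deviations)
print(f"expected beyond 4 sigma in 50,000: {expected_in_50000:.2f}")
print(f"expected beyond 4 sigma in 1,000:  {expected_in_1000:.3f}")
print(f"chance of a single deviation beyond 6 sigma: {chance_six_sigma:.2e}")
```

On these figures about three deviations beyond four times the standard-deviation are to be expected in 50,000 measurements, in agreement with the statement above.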

* I exclude the possibility of any serious error of measurement, having reason to believe in the great
care with which the determinations were made.

There is another point, however, in which the normal curve, based on the total
observations, diverges considerably from the observational result, namely (see fig. 4),
in the defect of carapaces about 45. This defect largely contributes to the 
asymmetrical appearance of the curve. I felt very confident that by neglecting 
the eccentric group of " giants" I could find two components, whose resultant would 
fit the curve of observation as closely as the resultant-curves found for the similar 
case of the forehead of Crabs. I was peculiarly interested, however, in ascertaining 
whether the method of resolution by aid of the nonic would pay more attention to 
the outlying giants or to the less improbable defect of individuals about 45. I even 
imagined that out of the nine possible solutions some might be solutions for the 
giants and some for the 45 defect. As a matter of fact, the two solutions which 
have any meaning are entirely taken up with the very improbable outlying eccen- 
tricities of the observations. These eccentricities must first be removed from the 
observations before the method will be of service in resolving the asymmetry of the 
bulk of the observation-curve. 

The method in which the nonic deals with the abnormalities is very characteristic, 
and I venture to think highly suggestive. 

In fig. 4 the normal curve excluding the two giants is given. It fits the
observation-curve, as far as appearances go, slightly better than the true normal curve.
But the first solution of the nonic tells us not to absolutely reject the giants. It
gives us two components, the first of which fits the observations slightly better than
the normal curve D (giants excluded). It has practically the same area (995.86 as
compared with 996), a slightly less standard-deviation (3.5595 as compared with
3.6051), and consequently an increased maximum ordinate. This, with a slightly
shifted axis, gives a somewhat better fit. In addition to this first component we have
a second component with an area of 2.140, and a mean of 70 for the carapace. This
component corresponds closely to the two giants with a mean of 67. It has, however,
an imaginary standard-deviation. Clearly the addition of two to the first
component, if distributed really, could make no sensible change in its appearance,
and we may then sum up the first solution of the nonic in the following words:

It does not absolutely reject the two giants, but places an imaginary distribution of
2.14 in their neighbourhood, and thus obtains for the other component and the
resultant-curve (which must be practically identical with it) a better approach to the
observation-curve than if the giants had been rejected.

It would appear, therefore, that our method of dissection offers, by means of 
small components with imaginary distributions, a means of obtaining better results 
than by simply rejecting (or, perhaps, even weighting) anomalous observations. 

The second method by which the nonic attempts to account for the eccentricities of
these carapace measurements is by mixing a small population of about 2.7 per cent. of
giants with the normal population. These giants have a mean carapace of 48.5, while
the rest of the population has a mean of only 43. This population of giants, however,
has a very large standard-deviation, i.e., 8.9330 as compared with the 3.3897 of the
rest of the population. It is clear that this population of giants is an unstable
population, i.e., a very small disturbance would largely change its centre. That it 
accounts for and covers the dwarf and two giant anomalies is clear, and the resultant- 
curve, based on the addition of the two components, is a fairly close approach to the 
observation-curve— far closer indeed than that provided by the first solution, and a 
great advance on the normal-curve C, resulting from the observations as a whole (see 
fig. 5). I am inclined, accordingly, to suspect that the family of Prawns was not 
homogeneous, but contained between 2 and 3 per cent, of a giant population with a 
large standard deviation. Possibly the theory of correlations may settle whether this 
is the real state of the case, or whether the anomalies referred to ought to be rejected 
and a new investigation made to dissect the asymmetrical curve for the carapaces 
when the outlying parts, which control the nonic at present, are removed. 
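That both solutions, including the one with an imaginary standard-deviation, reproduce the moments of the whole group may be checked by summing the moments of the components. In the sketch below each component is described by its area, its centre measured from the general mean, and the square of its standard-deviation, which is negative for the imaginary component of solution (A). The names are illustrative.

```python
def mixture_moments(components, n):
    """First three moments about the general mean of a sum of normal components.

    Each component is (area c, centre b, variance s2); s2 may be negative
    when the standard-deviation is imaginary.
    """
    first = sum(c * b for c, b, _ in components) / n
    second = sum(c * (s2 + b * b) for c, b, s2 in components) / n
    third = sum(c * (b**3 + 3.0 * b * s2) for c, b, s2 in components) / n
    return first, second, third

N_TOTAL = 998
solution_a = [(995.860, -0.0576086, 3.5595**2),
              (2.140, 26.8156194, -(5.7626**2))]   # imaginary sigma: negative variance
solution_b = [(971.0564, -0.1473614, 3.389672**2),
              (26.9436, 5.3109521, 8.932996**2)]

moments_a = mixture_moments(solution_a, N_TOTAL)
moments_b = mixture_moments(solution_b, N_TOTAL)
print(moments_a)
print(moments_b)
```

In both cases the first moment is sensibly zero and the second agrees with the value 14.1167 of the whole group, to the accuracy of the rounded constants.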

The investigation of this case, however, with all the observations included, shows 
the great variety of solutions which may be suggested by the dissection of various 
anomalous and asymmetrical frequency-curves. 

Table I. Powers of the Natural Numbers.

First.   Second.   Third.    Fourth.      Fifth.         Sixth.
   1         1         1          1            1              1
   2         4         8         16           32             64
   3         9        27         81          243            729
   4        16        64        256        1,024          4,096
   5        25       125        625        3,125         15,625
   6        36       216      1,296        7,776         46,656
   7        49       343      2,401       16,807        117,649
   8        64       512      4,096       32,768        262,144
   9        81       729      6,561       59,049        531,441
  10       100     1,000     10,000      100,000      1,000,000
  11       121     1,331     14,641      161,051      1,771,561
  12       144     1,728     20,736      248,832      2,985,984
  13       169     2,197     28,561      371,293      4,826,809
  14       196     2,744     38,416      537,824      7,529,536
  15       225     3,375     50,625      759,375     11,390,625
  16       256     4,096     65,536    1,048,576     16,777,216
  17       289     4,913     83,521    1,419,857     24,137,569
  18       324     5,832    104,976    1,889,568     34,012,224
  19       361     6,859    130,321    2,476,099     47,045,881
  20       400     8,000    160,000    3,200,000     64,000,000
  21       441     9,261    194,481    4,084,101     85,766,121
  22       484    10,648    234,256    5,153,632    113,379,904
  23       529    12,167    279,841    6,436,343    148,035,889
  24       576    13,824    331,776    7,962,624    191,102,976
  25       625    15,625    390,625    9,765,625    244,140,625
  26       676    17,576    456,976   11,881,376    308,915,776
  27       729    19,683    531,441   14,348,907    387,420,489
  28       784    21,952    614,656   17,210,368    481,890,304
  29       841    24,389    707,281   20,511,149    594,823,321
  30       900    27,000    810,000   24,300,000    729,000,000
Table II. Ordinates of Normal Curve.

D = Deviation. S = Standard Deviation.
F = Frequency. P = Maximum Frequency $\left(= c/(\sqrt{2\pi}\,\sigma)\right)$.

D/S.    F/P.        D/S.    F/P.
0       1           1.6     .2780
0.1     .9950       1.7     .2357
0.2     .9802       1.8     .1979
0.3     .9560       1.9     .1645
0.4     .9231       2       .1353
0.5     .8825       2.2     .0889
0.6     .8353       2.4     .0561
0.7     .7827       2.6     .0340
0.8     .7262       2.8     .0198
0.9     .6670       3       .0111
1       .6065       3.2     .0060
1.1     .5467       3.4     .0031
1.2     .4868       3.6     .0015
1.3     .4286       3.8     .0007
1.4     .3753       4       .0003
1.5     .3246       5       .000,004
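The entries of Table II are the values of $e^{-\frac{1}{2}(D/S)^2}$, the ordinate of the normal curve expressed as a fraction of its maximum. A short sketch for reproducing or extending the table, with an illustrative function name:

```python
import math

def relative_ordinate(d_over_s):
    """F/P: ordinate of the normal curve as a fraction of the maximum ordinate."""
    return math.exp(-0.5 * d_over_s**2)

# A few deviations, in units of the standard-deviation.
for ratio in (0.0, 0.5, 1.0, 2.0, 3.0, 4.0):
    print(f"D/S = {ratio:3.1f}   F/P = {relative_ordinate(ratio):.4f}")
```

Values computed in this way agree with the printed table to within about 0.001.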



[Note, added February 10, 1894.— (1.) The importance of breaking up asymmet- 
rical frequency-curves into normal components has been recognized for a long time 
by anthropologists and biologists. Attempts at a solution have been made by 
R. Livi, 'Sulla statura degli Italiani,' Firenze, 1883 (see also 'Archivio per
l'Antropologia e l'Etnologia,' vol. 13, Firenze, 1883, and 'Annali di Statistica,'
vol. 8, 1883, pp. 119-56). Also by O. Ammon in his recent work 'Die natürliche
Auslese beim Menschen,' Jena, 1893. These attempts can hardly be looked upon
as serious. Professor Lexis and Dr. Venn have pointed out that the curve of 
deaths for each year for 1000 persons born in the same year — the true mortality- 
curve — is also in all probability a compound curve. 

Since writing the above memoir I have succeeded in resolving this mortality-curve 
into components which are not, however, all of the normal type, but become, as we 
approach infantile mortality, of the skew form (see p. 74 above). 

O. Ammon, in the volume cited above, endeavours to demonstrate an evolution in 
the length-breadth index of the skull of South-Germans since primitive times. He 
does this by comparison of the index as obtained from measurements on skulls from 
the Row-Graves and on modern skulls. He has not, however, noticed that the 
frequency-curve for Row-Grave skulls is asymmetrical. I have succeeded in 
breaking it up into two components, one of which practically coincides in mean 
and standard-deviation with the frequency-curve for the skulls of modern South-Germans.
In other words, the Row-Graves contain a mixed population, one element of
which corresponds closely to the modern South-German population. Ammon's state- 
ment, therefore, that an evolution has taken place in this particular skull index appears 
to fall to the ground. The whole problem of the compound nature of skull frequency- 
curves, both in England and Germany, is a very interesting and difficult one, and I 
do not wish at present to anticipate results, which I hope when my investigations 
are complete to publish as a whole. The above may suffice to indicate the range of 
problems to which a resolution of asymmetrical frequency-curves into normal 
components may be applied. 

(2.) With regard to the method adopted in the memoir itself, I am very conscious 
of the defects under which it suffers—the laborious character of the arithmetic 
involved, and the question of what may be the probable error of the solution obtained 
by the method of higher moments. But I had to deal with the fact that the problem 
is one which urgently needed a solution in the case of both economic and biological 
statistics. Better solutions than mine may be ultimately found, but although more 
than one mathematically trained statistician has for some time recognized the impor- 
tance of the problem, no solution, so far as I am aware, has hitherto been forthcoming. 

With regard to the amount of error introduced by the use of higher moments, a 
word may be said. I have not been able to work out the general problem suggested 
to me by Professor George Darwin : " Given the probable error of every ordinate 
of a frequency-curve, what are the probable errors of the elements of the two normal 
curves into which it may be dissected ? " 

I can, however, indicate the sort of differences which are likely to occur in results 
based on high or on low moments. Suppose the distribution of an organ in a group 
of animals actually does follow a normal frequency-curve. Then it is obvious that in 
selecting 1000 of these animals at random and measuring their organs, an error of the 
same magnitude in the frequency of an organ of a given size is more likely to occur in 
a size near the mean than in a size far from the mean. Now a low moment pays 
greater attention than a high moment to an error in the frequency near the mean 
and less attention than a high moment to one far off. In other words, a frequency- 
curve calculated from low moments fits best near the centre ; one calculated from 
high moments fits best near the tails of the observation-curve. The problem is 
accordingly the following : an error in frequency near the tail is not as probable as an 
equal error in frequency near the mean ; but if it does occur a high moment pays 
much more attention to it than a low moment ; on the other hand, the low moment 
pays more attention than the high moment to more probable errors in frequency. 
Which tendency on the whole will prevail ? 

Turning to the result in the foot-note, p. 92, we have for the 2r-th moment

$M_{2r} = (2r - 1)(2r - 3) \cdots 5 \cdot 3 \cdot 1 \; \sigma^{2r} c,$

and

$M_{2r} = S(x^{2r} y\, \delta x).$






Then the equation (A) takes the form

\[ xyzu = (x - aT)(y - bT)(z - cT)(u - dT) \qquad \text{(C)}, \]

and it represents, besides the plane T, the cubic surface passing through the twelve
straight lines, which are represented in the annexed figure, as well as three other
straight lines which are not represented in the figure.







[Figure: the twelve straight lines lying on the cubic surface; the point labels C and D are legible.]

The equations of the lines may be written as follows: [the paired equations numbered (1) to (13), formed from x, y, z, u and aT, bT, cT, dT, are illegible in the source]

which meets (3), (4), (9), and (10);
moments respectively, I notice the following values for the standard-deviation of
"Crabs No. 4," as calculated from the second, fourth, and sixth moments:

$\sigma_2 = 2.74$,

$\sigma_4 = 2.77$,

$\sigma_6 = 2.84$.
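The calculation of a standard-deviation from an even moment follows from the relation $M_{2r} = (2r-1)(2r-3)\cdots 3\cdot 1\;\sigma^{2r} c$ for a normal curve of area $c$. A minimal sketch, in which the function names and the grouped-frequency helper are illustrative and no data from the memoir are used:

```python
def odd_product(r):
    """(2r - 1)(2r - 3) ... 5 . 3 . 1"""
    result = 1
    for k in range(1, 2 * r, 2):
        result *= k
    return result

def sigma_from_even_moment(moment, area, r):
    """Standard-deviation implied by the 2r-th moment of a normal curve of given area."""
    return (moment / (odd_product(r) * area)) ** (1.0 / (2 * r))

def even_moment(sizes, frequencies, r):
    """2r-th moment about the mean of a grouped frequency distribution."""
    area = sum(frequencies)
    mean = sum(x * f for x, f in zip(sizes, frequencies)) / area
    return sum(f * (x - mean) ** (2 * r) for x, f in zip(sizes, frequencies))

# For an exactly normal curve the three estimates coincide.
sigma, area = 2.74, 1000.0
estimates = [sigma_from_even_moment(odd_product(r) * sigma ** (2 * r) * area, area, r)
             for r in (1, 2, 3)]
print(estimates)
```

For observed frequencies the three estimates differ slightly, as in the values quoted above, the higher moments giving more weight to the tails.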

Practically, it would be difficult to say which of these results gives the best fitting 
theoretical curve. For statistics of this kind they are sensibly the same. Thus, till 
another method of attacking the problem of the resolution of asymmetrical frequency- 
curves is propounded, I think there is not sufficient evidence against the use of higher 
moments to lead us to discard a method based upon them as essentially likely to lead 
to large errors,— K. P.] 